CRYSTAL STRUCTURE OF GLUTAMATE RACEMASE (Murl)
Inventors: Marie Anderson, Stewart Lindsay Fisher, Rutger Henk Adriaan Folmer, 
Gunther Kern, Rolf Tomas Lundqvist, Trevor Newton, and Yafeng Xue. 


RELATED APPLICATIONS 

The present application is related to U.S. Provisional applications
60/435,272, entitled "Crystal Structure of Glutamate Racemase (Murl) from Gram
positive Bacteria"; 60/435,167, entitled "Crystal Structure of Glutamate Racemase
(Murl) from Gram negative Bacteria"; 60/435,087, entitled "Crystal Structure of
Glutamate Racemase (Murl) from Helicobacter pylori"; and 60/435,527, entitled
"Crystal Structure of Glutamate Racemase (Murl) from Helicobacter pylori
Complexed with Inhibitors", each of which was filed on December 20, 2002. The
entire teachings of each of the referenced applications are incorporated herein by
reference in their entirety.

BACKGROUND OF THE INVENTION 

Certain species of Gram negative bacteria are important human pathogens,
and the prevalence of their association with human disease is increasing. Extensive
antibiotic resistance has developed in Gram negative bacteria through three basic
mechanisms: alteration of the drug target; drug inactivation; and reduction of
cell membrane permeability, due either to altered porins or to the acquisition or
induction of efflux pumps (Waterer and Wunderink, Crit. Care Med. 29: 75-81
(2001)). Species such as Pseudomonas aeruginosa, Acinetobacter spp.,
Stenotrophomonas maltophilia, and members of the Enterobacteriaceae are
particularly problematic in intensive care units (Waterer and Wunderink, Ibid.).

Chronic pulmonary infection with Pseudomonas aeruginosa is the major
cause of lung function decline and mortality in cystic fibrosis patients and is also a
major problem in severe burn victims (Lyczak et al. Clin. Microbiol. Rev., 15: 194-222
(2002); and Lyczak et al. Microbes Infect. 2: 1051-1060 (2000)).
The microbial etiology of urinary tract infections has been well established.
Escherichia coli remains the predominant uropathogen (80%) isolated in acute
community-acquired uncomplicated infections, followed by Staphylococcus
saprophyticus (10-15%), Klebsiella, Enterobacter, and Proteus species (Ronald, A.
Am. J. Med. 113 (Suppl. 1A): 14S-19S (2002)).

Certain species of Gram positive bacteria are important human pathogens,
and the recent development of broad spectrum antibiotic resistance among these
organisms has been identified as a critical human health issue (McDevitt and
Rosenberg, Trends in Microbiol. 9: 611-617 (2001)). Members of the Enterococci,
including Enterococcus faecalis and Enterococcus faecium, are agents of
endocarditis and of urinary tract, bloodstream, and wound infections (Harbarth et al.
Antimicrob. Agents Chemo. 46: 1619-1628 (2002)). Species such as Staphylococcus
aureus, Streptococcus pneumoniae, and Streptococcus pyogenes are major causes of
respiratory tract infections, including sinusitis, otitis media, bacterial meningitis and
community-acquired pneumonia, and represent a leading cause of morbidity and
mortality world-wide (Paradisi et al. Clin. Micro. Inf. 7(Suppl. 4): 34-42 (2001); and
McIntosh, K. N. Engl. J. Med., 346: 429-437 (2002)). The development of
resistance to current, commonly used antibiotics has risen steadily over the past two
decades, with rates exceeding 60-80% in some countries (Appelbaum, P.C. Clin.
Infect. Diseases, 34: 1613-1620 (2002)). The prevalence of these resistant
organisms has limited treatment options in the clinic, and the use of "last resort"
antibiotics such as vancomycin has increased dramatically as a result. The
emergence of Enterococci species that harbor resistance genes capable of conferring
high level glycopeptide resistance (Harbarth et al. Ibid; and Linares, H. Clin. Micro.
Inf. 7(Suppl. 4): 8-15 (2001)) has demonstrated that Gram positive pathogens are
capable of developing resistance to all known therapies and that future infections
may be untreatable without the development of new antibacterial agents.

Helicobacter (H.) pylori infection is one of the most common bacterial
infections in humans. H. pylori is the causative agent of peptic ulcers, gastric MALT
lymphoma, dyspepsia, gastroesophageal reflux disease, and other diseases of the
upper gastrointestinal tract, and has been linked to the development of gastric
adenocarcinoma. Establishment of the bacteria in the upper gastrointestinal tract
causes continuous gastric inflammation and induces a vigorous systemic and
mucosal humoral response which causes tissue damage, but does not eradicate the
bacteria (Suerbaum and Michetti, N. Engl. J. Med. 347(15): 1175-1186 (2002)).
An approach to disabling or killing H. pylori would be beneficial.

Hwang et al. (Nature Structural Biology, 6(5): 422-426 (1999)) report the
crystal structure of Murl from Aquifex pyrophilus, determined at 2.3 Å resolution.

SUMMARY OF THE INVENTION 

The structure of bacteria includes a peptidoglycan layer, located between the
cytoplasmic and outer membranes of the cell wall, which is crucial to the structural
integrity of the cell. Peptidoglycan is a large polymer, an essential component of
which is the D-amino acid D-glutamate. The glutamate racemase enzyme (Murl)
catalyzes the reversible interconversion of L-glutamate and D-glutamate and, thus,
plays an important role in cell wall synthesis and bacterial growth. Because
peptidoglycan is unique to bacteria, it and the enzymes involved in its biosynthesis
are of interest as targets for designing or identifying antibacterial drugs.

Described herein are the three-dimensional structures of Murl from
Gram negative, Gram positive, and atypical bacteria such as Escherichia (E.) coli;
Enterococcus (E.) faecalis; Enterococcus (E.) faecium; Staphylococcus (S.) aureus;
and Helicobacter (H.) pylori; binding domains of Murl; conserved sequences of
Murl; methods of identifying or designing agents that bind Murl (e.g., binding
agents, ligands, drugs, or inhibitors that partially or totally inhibit Murl activity,
proteins, small organic molecules); methods of crystallizing Murl; computer-assisted
methods of identifying, screening, and/or designing agents that bind Murl;
the use of the crystals in the preparation of a medicament for the treatment of
bacterial infections; pharmaceutical compositions and packages; methods of treating
bacterial infections in subjects comprising administering inhibitors of Murl; and
methods of conducting business.

Gram negative bacteria include, for example, Escherichia species,
Haemophilus influenzae, Klebsiella pneumoniae, Moraxella catarrhalis, Vibrio
cholerae, Proteus mirabilis, Pasteurella multocida, Acinetobacter baumannii,
Bacteroides fragilis, Treponemal species, Borrelial species, Deinococcal species,
Pseudomonas species, Salmonella species, Shigella species, Yersinia species and
Porphyromonas gingivalis.

Gram positive bacteria include the bacteria of Bacillus species,
Staphylococcal spp., Streptococcal species, Enterococcal species, Lactobacilli,
Pediococci, and Mycobacterial species. More specifically, Gram positive bacteria
include, for example, B. subtilis, S. aureus, E. faecalis, and E. faecium.

Atypical bacteria include, for example, Helicobacter species, Campylobacter 
jejuni, and Aquifex aeolicus. 


BRIEF DESCRIPTION OF THE FIGURES 

Figures 1A-1H present the topology of the Murl fold and various conformations of the
enzyme. Figure 1A illustrates the three-dimensional structure of one domain of
Murl, depicting structural elements, and illustrates the topology of the Murl fold.
Figures 1B and 1C provide a cartoon depiction of Murl in open (1B; black) and
closed (1C; gray) conformations. Figures 1D and 1F depict the tail-tail structure of
Gram positive Murl in different conformations. Figure 1E depicts the head-head
structure of atypical Murl. Figure 1G depicts the structure of Gram negative Murl
having both a substrate binding site (left side of 1G) and an activator binding site
(right side of 1G).

Figures 2A-2O depict the structural elements which are conserved across all
Murl, across Gram-positive and atypical bacteria, or across all Gram-positive
bacteria, by specific amino acid residues. Structural motifs, such as helices (H),
beta sheets (E), loops (S) and turns (T), are indicated.

Figure 3 shows the distance (Å) between the active site cysteines of Murl crystal
structures in different conformations. Measures of conformational movement are
shown in columns 3-5, and the angles associated with conformational movement are
shown in columns 6-7.
Figures 4A-4ZZZ show the listing of the three-dimensional atomic coordinates of a
model derived from an H. pylori Murl crystal structure. The coordinates
presented are those of both chains of the dimeric protein. In the figures, the atom
listing is preceded by the heading CRYST1, which is followed by the 3 dimensions
(see Figure 4A) of the crystallographic unit cell. The next three values define a
matrix which converts coordinates from orthogonal Angstrom coordinates to
fractional coordinates of the unit cell. Each row labeled ATOM gives the (arbitrary)
atom number, the label given to each amino acid main chain, each atom type, the
amino acid residue type, the protein chain label (A comprises the first molecule
(chain) and B comprises the second molecule (chain)), and the amino acid residue
number. The next three numbers in the row give the orthogonal X, Y, Z coordinates
of the atom. The next number is an occupancy number and would be less than 1.0 if
the atom could be seen in more than one position (the amino acid could be seen in
more than one orientation). The final number is a temperature factor which relates
to the thermal amplitude of vibrations of the atom. At the end of the listing, there
are lines of data indicating the ordered water molecules (TIP or WAT) included in
the model.
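
By way of illustration only, the row layout described above can be read programmatically. The following is a minimal sketch which assumes that the listing follows the standard fixed-width PDB column positions; the figures are not reproduced here, so that layout is an assumption, and the names AtomRecord and parse_atom_line, as well as the example row, are hypothetical.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    serial: int       # (arbitrary) atom number
    name: str         # atom label, e.g. "CA"
    res_name: str     # amino acid residue type, e.g. "GLY"
    chain: str        # protein chain label ("A" or "B" for the dimer)
    res_seq: int      # amino acid residue number
    x: float          # orthogonal coordinates in Angstroms
    y: float
    z: float
    occupancy: float  # below 1.0 if the atom is seen in several positions
    b_factor: float   # temperature factor


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one ATOM row, assuming standard PDB fixed-width columns."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return AtomRecord(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


# Hypothetical row assembled column by column; all values are made up.
example = (
    "ATOM  " + "    1" + " " + " CA " + " " + "GLY" + " " + "A" + "  10" + "    "
    + f"{12.5:8.3f}{-3.25:8.3f}{40.0:8.3f}{1.0:6.2f}{25.0:6.2f}"
)
record = parse_atom_line(example)
```

Slicing by column rather than splitting on whitespace is chosen because adjacent numeric fields in such listings can run together without a separating space.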

Figures 5A-5ZZZ show the listing of the three-dimensional atomic coordinates of
the crystal structure of Murl from H. pylori complexed with D-glutamate.

Figures 6A-6AAAA show the listing of the three-dimensional coordinates of the
crystal structure of Murl from H. pylori complexed with glutamate and the
pyrimidinedione inhibitor, compound A.

Figures 7A-7AAAA show the listing of the three-dimensional coordinates of the
crystal structure of Murl from H. pylori complexed with the pyrimidinedione
inhibitor, compound A.

Figures 8A-8OO show the listing of the three-dimensional atomic coordinates of an
E. coli Murl crystal structure complexed with glutamate.

Figures 9A-9OO show the listing of the three-dimensional coordinates of the crystal
structure of Murl from E. coli complexed with activator.

Figures 10A-10NN show the listing of the three-dimensional coordinates of a model
derived from a crystal structure of Murl from E. coli.

Figures 11A-11OO show the listing of the three-dimensional coordinates of the
crystal structure of Murl from E. coli complexed with glutamate and
activator.

Figures 12A-12OOO show the listing of the three-dimensional coordinates of a
model derived from a crystal structure of Murl from E. faecalis. The
coordinates presented are those of both chains of the dimeric protein.

Figures 13A-13OOO show the listing of the three-dimensional coordinates of the crystal
structure of Murl from E. faecalis complexed with D,L-glutamate.

Figures 14A-14MMM show the listing of the three-dimensional coordinates of a
model derived from a crystal structure of Murl from S. aureus.

Figures 15A-15MMM show the listing of the three-dimensional coordinates of the
crystal structure of Murl from S. aureus complexed with D-glutamate.

Figures 16A-16JJ show the listing of the three-dimensional coordinates of a model
derived from a crystal structure of Murl from E. faecium.

Figures 17A-17II show the listing of the three-dimensional coordinates of the crystal
structure of Murl from E. faecium complexed with tartrate.

Figures 18A-18JJ show the listing of the three-dimensional coordinates of the
crystal structure of Murl from E. faecium complexed with citrate.

Figures 19A-19JJ show the listing of the three-dimensional coordinates of the
crystal structure of Murl from E. faecium complexed with phosphate.

Figure 20A provides a nuclear magnetic resonance (NMR) [15N, 1H] correlation
spectrum of 0.3 mM 15N-labeled H. pylori Murl protein.

Figure 20B provides a nuclear magnetic resonance (NMR) [15N, 1H] correlation
spectrum of 0.3 mM 15N-labeled H. pylori Murl protein, to which 0.6 mM
D-glutamate was added.

Figure 20C provides a nuclear magnetic resonance (NMR) [15N, 1H] correlation
spectrum of 0.3 mM 15N-labeled H. pylori Murl protein, to which 1.8 mM
D-glutamate was added. Box 1 shows the two tryptophan side chain amide groups. In
this spectrum, with saturating conditions of D-glutamate, the tryptophans take up
a unique conformation. In Figures 20A and 20B, multiple conformations are visible
(i.e., more than two peaks). Box 2 shows signals which are shifted to low-field
NMR frequencies. Such signals are indicative of a high degree of structural content.

Figure 21A provides a nuclear magnetic resonance (NMR) [15N, 1H] correlation
spectrum of 0.3 mM 15N-labeled H. pylori Murl protein, to which a saturating
amount of compound A was added.

Figure 21B provides a nuclear magnetic resonance (NMR) [15N, 1H] correlation
spectrum of 0.3 mM 15N-labeled H. pylori Murl protein, to which a saturating
amount of D-glutamate was added.

Figure 21C provides a nuclear magnetic resonance (NMR) [15N, 1H] correlation
spectrum of 0.3 mM 15N-labeled H. pylori Murl protein, to which saturating
amounts of D-glutamate and compound A were added.

Figure 21D provides an overlay of the spectra shown in Figures 21B and 21C.

Figure 22 provides a phylogenetic analysis of Murl orthologs indicating dimeric
(Gram positive), monomeric (Gram negative), and atypical dimeric structures by
genus and species.


DETAILED DESCRIPTION OF THE INVENTION 

I. Overview 

One of ordinary skill in the art would recognize that solving crystal
structures of proteins such as Murl requires a stable source of high-quality protein.

As described herein, Murl of three classes of bacteria (Gram negative, Gram
positive, and atypical) has been crystallized and the crystal structures
(three-dimensional structures) determined.

II. Polypeptides, Crystals and Space Groups 

Crystallization of Murl from H. pylori, E. coli, E. faecalis, E. faecium, and S.
aureus has been previously described in detail in related U.S. Provisional
applications 60/435,272; 60/435,167; 60/435,087; and 60/435,527, filed on
December 20, 2002, each of which is incorporated herein by reference in its entirety.

One embodiment of the invention relates to an isolated polypeptide of a
portion of Murl which functions as a binding site when folded in the proper 3-D
orientation.

The terms "peptide", "polypeptide" and "protein" are used interchangeably
herein. These terms refer to unmodified amino acid chains, and also include minor
modifications, such as phosphorylation, glycosylation and lipid modification.
"Isolated" (used interchangeably with "substantially pure"), when applied to
polypeptides, means a polypeptide or a portion thereof which is distinguished by
virtue of its origin or manipulation. By "isolated" it is further meant a protein that
is: (i) synthesized chemically; (ii) expressed in a host cell and purified away from
associated and contaminating proteins; or (iii) purified away from associated and
contaminating proteins. The term generally means a polypeptide that has been
separated from other proteins and nucleic acids with which it naturally occurs.
Preferably, the polypeptide is also separated from substances such as antibodies or
gel matrices (polyacrylamide) which are used to purify it.

Each of the isolated polypeptide sequences can be a native sequence of Murl,
or a sequence that is at least 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
homologous to the amino acid sequence represented by any one of SEQ ID NOS: 2-34,
40, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, and 74.

A. Gram negative - E. coli

One of ordinary skill in the art would recognize that solving crystal
structures of proteins such as glutamate racemase (also referred to as Murl) requires
a stable source of high-quality protein.

One embodiment of the present invention relates to a crystal of Murl. In one
embodiment, the present invention is crystallized E. coli Murl complexed with
L-glutamate. One embodiment of the present invention is crystallized E. coli Murl
complexed with glutamate characterized by the structural coordinates depicted in
Figure 8. One embodiment of the present invention is crystallized E. coli Murl
complexed with activator characterized by the structural coordinates depicted in
Figure 9. One embodiment of the present invention is crystallized E. coli Murl
characterized by the structural coordinates depicted in Figure 10. One embodiment
of the present invention is crystallized E. coli Murl complexed with glutamate and
activator characterized by the structural coordinates depicted in Figure 11.

One embodiment of the crystallized complex is characterized as belonging to
the orthorhombic space group C2221 and has cell dimensions of a = 83.05 Å, b =
112.82 Å and c = 74.12 Å, wherein α = 90°, β = 90°, and γ = 90°. Another
embodiment of the crystallized complex is characterized as belonging to the
monoclinic space group P21 and has cell dimensions of a = 70.04 Å, b = 74.13 Å
and c = 70.10 Å, wherein α = 90°, β = 107.25°, and γ = 90°.
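
For illustration, cell dimensions such as those above determine the matrix that converts between orthogonal Angstrom coordinates and fractional coordinates of the unit cell. The sketch below assumes the common convention in which the a axis lies along x and the b axis lies in the xy plane; the convention actually used in the coordinate listings is not stated here, so this is an assumption, and the function names are hypothetical.

```python
import math


def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Matrix mapping fractional to orthogonal coordinates (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    volume = a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    )
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, volume / (a * b * sg)],
    ]


def to_orthogonal(frac, cell):
    """Convert fractional coordinates to orthogonal Angstrom coordinates."""
    m = orthogonalization_matrix(*cell)
    return tuple(sum(m[i][j] * frac[j] for j in range(3)) for i in range(3))


def to_fractional(xyz, cell):
    """Convert orthogonal coordinates to fractional ones by back substitution
    (the orthogonalization matrix is upper triangular)."""
    m = orthogonalization_matrix(*cell)
    w = xyz[2] / m[2][2]
    v = (xyz[1] - m[1][2] * w) / m[1][1]
    u = (xyz[0] - m[0][1] * v - m[0][2] * w) / m[0][0]
    return (u, v, w)


# Monoclinic P21 cell dimensions quoted in the text above.
p21_cell = (70.04, 74.13, 70.10, 90.0, 107.25, 90.0)
corner = to_orthogonal((0.0, 0.0, 1.0), p21_cell)  # tip of the c axis
```

Because β is not 90° in the monoclinic cell, the tip of the c axis has a non-zero x component, which is why a full matrix rather than simple division by a, b and c is needed.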

One embodiment of the present invention is a crystal of E. coli Murl,
wherein the Murl is at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99%
homologous to the amino acid sequence represented by SEQ ID NO: 40, or portions
thereof. In the present invention, the crystals diffract from about 0.8 Å to about
3.5 Å.

The Murl protein exists as a monomer, but inspection of the electron density
map identified additional molecular symmetry. Each monomer has a pseudo two-fold
symmetry that divides the monomer into two domains with very similar
alpha/beta type folds. The binding site is found in the interface between the two
domains. On the opposite side of the monomer, but still in the interface, is the
binding site for an activator, such as UDP-N-acetylmuramyl-L-alanine
(UDP-MurNAc-Ala), wherein the activator acts as a wedge to stabilize an active
conformation in which the two domains are properly oriented to promote binding of
substrate.

In the present invention, a substrate can be a compound, such as L-glutamate or
D-glutamate, that Murl reversibly converts from the R- to the S-enantiomer.
Thus, a substrate can also act as a product. The substrate can be a
naturally-occurring or artificial compound.

In the present invention, an inhibitor can be a compound which also may
undergo a catalytic reaction, or which binds to the substrate binding site or another
site on Murl and which competes with substrate turnover of glutamate. Inhibitors of
the present invention can be a compound such as L-serine-O-sulfate,
D-serine-O-sulfate, D-aspartate, L-aspartate, tartrate, citrate, phosphate, sulfate,
aziridino-glutamate, N-hydroxyglutamate, or 3-chloroglutamate. The inhibitor can be a
naturally-occurring or artificial compound.

In the present invention, an activator can be a compound such as
UDP-MurNAc-Ala. The activator can be a naturally-occurring or artificial compound.

The activator has a compact structure when bound to the protein. It folds
back on itself into a two-layered structure where two phosphate groups act as a
bridging link. The uridine ring stacks against the muramic acid ring. There is
almost a perfect shape match between the protein and activator as it resides on top of
the two connecting loops between the two domains, in sequence corresponding to
residues 112-116 and 225-228. In the absence of activator, the two loops act as an
interface between the two domains.

However, the binding is not entirely driven by favorable van der Waals
interactions. There are a number of polar interactions and at least one strong salt
bridge between Arg104 and the carboxylate group of the alanine motif of the
activator. The latter salt bridge could be a key interaction for the activator in order
to be able to lock the two domains into the proper orientation, explaining the need
for the terminal alanine residue in order for the molecule to work as an activator.

Other extensive polar interactions are found between the uracil ring and main
chain atoms of the protein, residues 113-115. There are also a number of hydrogen
bonds from the hydroxyl groups of the two sugar rings. Many interactions between
inhibitor and protein are mediated by water molecules. In contrast, the two
phosphate groups make very little contact with the protein since they are facing
towards the solution. However, on each side of the diphosphate group is a positively
charged residue, Lys119 on one domain, and Arg233 on the other domain, which
provides another example of domain-domain stabilizing interaction.

A further embodiment relates to an E. coli Murl in which the substrate binding
site comprises two conserved cysteine residues, denoted Cys92 and Cys204 in the
amino acid sequence represented as SEQ ID NO: 40. A further embodiment
comprises a substrate binding site of E. coli Murl wherein the binding site
additionally comprises one or more of the following amino acid residues: Ser29,
Thr94, Thr135, Thr138, Glu170, Thr205 and His206 as represented by the structural
coordinates of Figure 8. In one embodiment, the substrate binding site of the E. coli
Murl complexes with L-glutamate and comprises amino acid residues Cys92 and
Cys204 as well as amino acid residues within 5 Å of the Cys92 and Cys204 residues
as represented by the structural coordinates of Figure 9. In one embodiment, the
substrate binding site of the E. coli Murl additionally comprises one or more of the
following amino acid residues: Ser29, Thr94, Thr135, Thr138, Glu170, Thr205 and
His206. In this embodiment, the substrate binding site complexes with L-glutamate
and comprises amino acid residues Cys92 and Cys204 and at least one (i.e., one or
more) of the following amino acid residues: Ser29, Thr94, Thr135, Thr138, Glu170,
Thr205 and His206, which can be present in any combination. In a further
embodiment, the substrate binding site includes amino acid residues Phe27, Asp28,
Ser29, Gly30, Val31, Gly32, Gly33, Ser35, Val36, Asp54, Ala57, Ala57, Phe58,
Pro59, Tyr60, Gly61, Glu62, Lys63, Ile68, Val90, Ala91, Cys92, Asn93, Thr94,
Ala95, Ser96, Thr97, Val114, Val115, Leu133, Ala134, Thr135, Arg136, Gly137,
Thr138, Val139, Thr144, Ala163, Val166, Glu167, Glu170, Leu202, Gly203,
Cys204, Thr205, His206, Phe207 and Ser227 of SEQ ID NO: 40.
In a further embodiment, the substrate binding site of the Murl comprises
two hydrogen bond TRIADs, which occur close to the conserved cysteine residues
of the binding site. Specifically, on one side (Cys204) of the binding site, the
TRIAD is Glu170-Thr205-His206 and on the other side (Cys92), the TRIAD is
Thr94-Thr135-Thr138. Thus, in a specific embodiment of this invention, the
substrate binding site of E. coli Murl complexes with L-glutamate and comprises
amino acid residues Cys92 and Cys204 and additionally includes at least one (i.e.,
one or more) of the following: amino acid residues Ser29, Thr94, Thr135, Thr138,
Glu170, Thr205 and His206.

The threonine residues have features of interest as they relate to the binding
site of the E. coli Murl. On one side (Cys204) of the substrate binding site, Thr205
is H-bonded to His206 and His206 is further H-bonded to Glu170. The hydroxyl
oxygen (O) of Thr205 is 2.8 Å away from the amino nitrogen (N) of the substrate
(i.e., glutamate) and 5.3 Å from the sulfur (S) atom of Cys204, which is 4.3 Å from
His206. On the other side of the substrate binding site (Cys92), Thr94 is H-bonded
to Thr135 and further H-bonded to Thr138. The hydroxyl O of Thr94 is H-bonded
to one of the carboxylate oxygen atoms of the substrate and is 3.3 Å away from the
S atom of Cys92. Analysis showed that the three hydroxyl oxygens form a triangle,
all within less than 3.2 Å from one another.
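
Interatomic distances of the kind quoted above, and the selection of residues lying within 5 Å of the binding site cysteines, can be computed directly from the orthogonal coordinates. The following is a minimal sketch; the helper names are hypothetical, and any coordinates supplied to it in an example are placeholders rather than values taken from the figures.

```python
import math


def distance(p, q):
    """Euclidean distance in Angstroms between two (x, y, z) points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def residues_within(centers, atoms, cutoff=5.0):
    """Return the sorted labels of residues having at least one atom within
    `cutoff` Angstroms of any point in `centers`.

    `atoms` is an iterable of (residue_label, (x, y, z)) pairs.
    """
    hits = set()
    for label, xyz in atoms:
        if any(distance(xyz, c) <= cutoff for c in centers):
            hits.add(label)
    return sorted(hits)
```

In use, `centers` would hold the coordinates of the atoms of the two cysteines and `atoms` every atom of the chain, so that a residue is selected as soon as any one of its atoms falls inside the cutoff.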

The two TRIADs may play important roles in altering the pKa of the two
substrate binding site cysteine residues (in addition to that of the neighboring
hydrophobic core), facilitating the proton transfer during catalysis, or both.



B. Gram positive

1. E. faecalis



ASZD-PO 1-007 

One of ordinary skill in the art would recognize that solving crystal 
structures of proteins such as E. faecalis, E. faecium, and S. aureus glutamate 
racemase (Murl) requires a stable source of high-quality protein. 

One embodiment of the present invention relates to a crystal of Murl from a
Gram positive bacterium.

One embodiment of the present invention relates to crystallized E. faecalis
Murl characterized by the structural coordinates depicted in Figure 12. One
embodiment of the present invention discloses crystallized E. faecalis Murl
complexed with D- and/or L-glutamate characterized by the structural coordinates
depicted in Figure 13.

One embodiment of the crystallized complex of E. faecalis Murl and D- or
L-glutamate is characterized as belonging to the orthorhombic space group P212121
and has cell dimensions of a = 60.29 Å, b = 82.08 Å and c = 115.57 Å, wherein α =
90°, β = 90°, and γ = 90°. This embodiment is encompassed by the structural
coordinates of Figure 12, or in complex with D- and/or L-glutamate as encompassed
by the structural coordinates of Figure 13.

One embodiment of the present invention is a crystal of E. faecalis Murl
complexed with the product (substrate) D- and/or L-glutamate wherein the Murl is
at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% homologous to the amino
acid sequence represented by SEQ ID NO: 44, or portions thereof. In the present
invention, the crystals diffract from about 0.8 Å to about 3.5 Å.

One embodiment of the E. faecalis crystallized complex with D- and/or
L-glutamate is characterized as belonging to the orthorhombic space group P212121 and
having cell dimensions of a = 60.29 Å, b = 82.08 Å and c = 111.57 Å, wherein α =
90°, β = 90°, and γ = 90°, wherein the crystallized complex is produced by the
process of preparing a first solution of about 5 mM D,L-glutamic acid, about 1 mM
TCEP, about 200 mM ammonium acetate pH 7.4, and about 10 mg/ml E. faecalis Murl;
preparing a second solution of about 100 mM Tris pH 7.5, about 0.2 mM MgCl2,
and about 20-25% PEG 4000; combining the first solution and the second solution,
thereby producing a combination; and forming drops from the combination
in a method of crystallization under conditions in which crystals of Murl are
produced, whereby crystals of Murl are produced.

A further embodiment relates to an E. faecalis glutamate racemase in which
the binding site comprises two conserved cysteine residues, denoted Cys74 and
Cys185 in the sequence represented as SEQ ID NO: 44. A further embodiment
comprises a binding site of E. faecalis glutamate racemase wherein the binding site
additionally comprises one or more of the following amino acid residues: Ser12,
Thr76, Thr118, Thr121, Glu153, Thr186 and His187 as represented by the structural
coordinates of Figure 12. In one embodiment, the binding site of the E. faecalis
glutamate racemase is complexed with D- and/or L-glutamate and comprises amino
acid residues Cys74 and Cys185 as well as amino acid residues within 5 Å of the
Cys74 and Cys185 residues as represented by the structural coordinates of Figure
13. In one embodiment, the binding site of the E. faecalis glutamate racemase
additionally comprises one or more of the following amino acid residues: Ser12,
Thr76, Thr118, Thr121, Glu153, Thr186 and His187. In this embodiment, the
binding site complexes with D- or L-glutamate and comprises amino acid residues
Cys74 and Cys185 and at least one (i.e., one or more) of the following amino acid
residues: Ser12, Thr76, Thr118, Thr121, Glu153, Thr186 and His187, which can be
present in any combination. In a further embodiment, the binding site includes
amino acid residues Ile10, Asp11, Ser12, Gly13, Val14, Gly15, Gly16, Thr18,
Val19, Tyr34, Asp37, Arg40, Cys41, Pro42, Tyr43, Gly44, Pro45, Arg46, Val51,
Glu53, Ile72, Ala73, Cys74, Asn75, Thr76, Ala77, Ser78, Ala79, Val96, Ile116,
Gly117, Thr118, Leu119, Gly120, Thr121, Ile122, Tyr127, Cys145, Pro146,
Val149, Pro150, Leu183, Gly184, Cys185, Thr186, His187, Tyr188 and Ser208 of
SEQ ID NO: 44.

In a further embodiment, the binding site of the glutamate racemase
comprises two hydrogen bond TRIADs, which occur close to the conserved cysteine
residues of the binding site. Specifically, on one side (Cys185) of the binding site,
the TRIAD is Glu153-Thr186-His187 and on the other side (Cys74), the TRIAD is
Thr76-Thr118-Thr121. Thus, in a specific embodiment of this invention, the
binding site of E. faecalis glutamate racemase is complexed with D- and/or
L-glutamate and comprises amino acid residues Cys74 and Cys185 and additionally
comprises at least one (one or more) of the following: amino acid residues Ser12,
Thr76, Thr118, Thr121, Glu153, Thr186 and His187.

The threonine residues have features of interest as they relate to the binding
site of the E. faecalis glutamate racemase. On one side (Cys185) of the binding site,
Thr186 is H-bonded to His187 and His187 is further H-bonded to Glu153. The
hydroxyl oxygen (O) of Thr186 is 2.9 Å away from the amino nitrogen (N) of the
substrate (glutamate) and 4.3 Å from the sulfur (S) atom of Cys185, which is 4.1 Å
from the nitrogen (N) of His187.

On the other side of the binding site (Cys74), Thr76 is H-bonded to Thr118
and further H-bonded to Thr121. The hydroxyl O of Thr76 is H-bonded to one of
the carboxylate oxygen atoms of the substrate and is 4.3 Å away from the S atom of
Cys74. Analysis showed that the three hydroxyl oxygens form a triangle, all within
less than 3.4 Å from one another.

The two TRIADs may play important roles in altering the pKa of the two
binding site cysteine residues (in addition to that of the neighboring hydrophobic
core), facilitating the proton transfer during catalysis, or both.

Another embodiment of the present invention is a crystal of glutamate
racemase, comprising a binding site comprising the amino acid residues Cys74,
Cys185, Ser12, Thr76, Thr118, Thr121, Glu153, Thr186 and His187 of the amino
acid sequence represented by SEQ ID NO: 44. One embodiment is a crystal of
glutamate racemase complexed with D- or L-glutamate. Additionally, the binding
site of the crystal complexed with D-glutamate can comprise the amino acid
residues Cys74, Cys185, and one or more of the following amino acid residues:
Ser12, Thr76, Thr118, Thr121, Glu153, Thr186 and His187. A further embodiment
is one in which the binding site additionally comprises at least one of the following
amino acid residues: Asp11, Gly13, Val14, Gly15, Gly16, Arg40, Cys41, Pro42,
Arg46, Ala73, Asn75, Ala77, Val96, Gly117, Val149, Gly184, and Tyr188. A
further embodiment is one in which the binding site additionally comprises at least
one of the following amino acid residues: Ile10, Thr18, Val19, Tyr34, Tyr43,
Gly44, Pro45, Asp37, Val51, Ile72, Ser78, Ala79, Ile116, Leu119, Gly120, Ile122,
Tyr127, Cys145, Pro146, Pro150, Leu183, and Ser208 of SEQ ID NO: 44. The
crystals comprising binding sites represented by the above amino acid residues
comprise the amino acid sequence represented by SEQ ID NO: 44, or an
amino acid sequence that is at least 75%, 80%, 85%, 90%, 95%, or 98% homologous to
the amino acid sequence represented by SEQ ID NO: 44.

Location and geometry of a binding site of the enzyme are also defined.
Two conserved cysteine (Cys) residues were identified as the residues responsible
for the (de)protonation of the alpha-carbon of the substrate during catalysis, which is
consistent with the two-base mechanism proposed for function of the enzyme in its
role as a racemase. The two binding site cysteines are Cys74 and Cys185, which are
about 7.0 Angstroms (Å) apart (Cα-Cα distance, which is the distance between Cα
atoms). Other amino acid residues identified in the vicinity of the binding site
include Ser12, Thr76, Thr118, Thr121, Glu153, Thr186 and His187. The bound
substrate/product, D- or L-glutamate, is located between the two conserved cysteine
residues. Further detail of the binding site is as follows: looking down the axis
defined by Glu153 along the γ and α carbons, the two cysteines exhibit a rather
symmetrical environment; each has a hydrophobic core behind it, with respect to the
substrate, and a neighboring threonine residue not far from the substrate C-alpha (3.7
- 3.4 Å when L-Glu is bound; and 3.4 - 4.3 Å when D-Glu is bound). Here, product
is present in one site and a substrate in the other. There are conformational changes
bringing the cysteines closer in the sub-unit with the substrate.

Analysis of the crystal structure of the E. faecalis glutamate racemase
indicates that the following amino acid residues are within 10 Å of the bound D- or
L-glutamate: 10-19, 34-46, 51, 72-78, 96-97, 116-122, 124, 127, 183-189, 208-209.
Analysis of the crystal structure also shows that the following amino acid residues
are within 4 Å of the bound D- or L-glutamate: 11-12, 41-44, 74-76, 118, 121, and
186-187. Of interest is the fact that only one acidic amino acid residue (Asp11) is
present in the structure surrounding the binding site Cys74, which serves as an
anchoring point for the amino group of the substrate/product with a polar (charged)
interaction between the amino nitrogen atom of D- or L-glutamate and the delta
oxygen atom of Asp11.
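
The distance criteria used above (residues within 10 Å or within 4 Å of the bound glutamate) can be illustrated with a minimal sketch. The function name `residues_within` and the coordinates below are hypothetical and chosen only for illustration; they are not taken from the structural coordinates of the Figures. A residue is counted when at least one of its atoms lies within the cutoff of any ligand atom.

```python
import math

def residues_within(protein_atoms, ligand_atoms, cutoff):
    """Return sorted residue numbers with at least one atom within
    `cutoff` (in Å) of any ligand atom."""
    near = set()
    for resnum, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            near.add(resnum)
    return sorted(near)

# Hypothetical toy coordinates (residue number, (x, y, z) in Å); NOT real data.
protein_atoms = [
    (11, (1.0, 0.0, 0.0)),
    (74, (0.0, 3.5, 0.0)),
    (96, (8.0, 0.0, 0.0)),
    (150, (20.0, 0.0, 0.0)),
]
ligand_atoms = [(0.0, 0.0, 0.0)]  # a single hypothetical ligand atom

print(residues_within(protein_atoms, ligand_atoms, 4.0))
print(residues_within(protein_atoms, ligand_atoms, 10.0))
```

In practice the atom lists would be read from a coordinate file; the simple all-pairs loop is adequate for a single binding site.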

2. E. faecium

The present invention provides a method of making a crystal of E. faecium
Murl complexes. The method comprises preparing a first solution of from about 2
to about 10 mM D,L-glutamate; from about 0.1 to about 10 mM reducing agent;
from about 50 to about 500 mM salt pH 6.5-9.4; and about 10 mg/ml E. faecium
Murl, wherein the glutamate, reducing agent, and salt are each present in sufficient
concentration to bind to Murl, inhibit oxidation of the protein, and stabilize the
protein and prevent aggregation, respectively; preparing a second solution of from
about 50 to about 500 mM salt pH 4.5-9.0, wherein the salt is present in sufficient
concentration to stabilize the protein, prevent aggregation and control the pH of the
solution; combining the first solution and the second solution, thereby producing a
combination; and forming drops from the combination under conditions in which
crystals of Murl are produced.

In one embodiment of the present invention, the substrate binding site of E.
faecium Murl includes amino acid residues: Ser15, Gly16, Val17, Gly18, Pro45,
Tyr46, Gly47, Cys77, Asn78, Thr79, Thr121, Ile122, Gly123, Thr124, Lys150,
Phe151, Val152, Gly187, Cys188, Thr189, and His190 of SEQ ID NO: 48.

3. S. aureus 

In one embodiment, the present invention is crystallized S. aureus Murl
characterized by the structural coordinates depicted in Figure 14.

In one embodiment, the present invention is crystallized S. aureus Murl
complexed with D-glutamate characterized by the structural coordinates depicted in
Figure 15.

One embodiment of the present invention is a crystal of S. aureus Murl
complexed with the product (substrate) D-glutamate, wherein the Murl is at least
70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% homologous to the amino acid
sequence represented by SEQ ID NO: 46, or portions thereof. In the present
invention, the crystals diffract from about 0.8 Å to about 3.5 Å.

One embodiment of the S. aureus crystal complex is characterized as
belonging to the monoclinic space group C2 with cell dimensions a = 96.43 Å, b =
88.87 Å, c = 96.56 Å, α = 90°, β = 109.00°, and γ = 90°, wherein the crystallized
complex is produced by the process of preparing a first solution of about 5 mM
D,L-glutamic acid; about 1 mM TCEP; about 200 mM ammonium acetate pH 7.4; and
about 10 mg/ml S. aureus Murl; preparing a second solution of about 0.17 M
ammonium sulfate, and about 25% PEG 9000; combining the first solution and the
second solution, thereby producing a combination; and forming drops from the
combination under conditions in which crystals of Murl are produced.

A further embodiment relates to a S. aureus glutamate racemase in which the
binding site comprises two conserved cysteine residues, denoted Cys72 and Cys184
in the sequence represented as SEQ ID NO: 46. A further embodiment comprises a
binding site of S. aureus glutamate racemase wherein the binding site includes
amino acid residues: Ser10, Pro40, Tyr41, Gly42, Cys72, Asn73, Thr74, Thr116,
Thr119, Glu151, Cys184, Thr185 and His186 as represented by the structural
coordinates of Figure 14. In one embodiment, the binding site of the S. aureus
glutamate racemase complexes with D-glutamate and comprises the relative
structural coordinates of amino acid residues Cys72 and Cys184 as well as amino
acid residues within 5 Å of the Cys72 and Cys184 residues as represented by the
structural coordinates of Figure 15. In one embodiment, the binding site of the S.
aureus glutamate racemase additionally comprises at least one (one or more) of the
following amino acid residues: Ser10, Pro40, Tyr41, Gly42, Asn73, Thr74, Thr116,
Thr119, Glu151, Thr185 and His186. In this embodiment, the binding site
complexes with D-glutamate and comprises amino acid residues Cys72 and Cys184
and at least one (one or more) of the following amino acid residues: Ser10, Pro40,
Tyr41, Gly42, Asn73, Thr74, Thr116, Thr119, Glu151, Thr185 and His186, which
can be present in any combination.

In a further embodiment, the binding site of the glutamate racemase
comprises two hydrogen bond TRIADs, which occur close to the conserved cysteine
residues of the binding site. Specifically, on one side (Cys184) of the binding site,
the TRIAD is Glu151-Thr185-His186 and on the other side (Cys72), the TRIAD is
Thr74-Thr116-Thr119. Thus, in a specific embodiment of this invention, the
binding site of S. aureus glutamate racemase complexes with D-glutamate and
comprises amino acid residues Cys72 and Cys184 and additionally comprises at
least one (one or more) of the following: amino acid residues Ser10, Pro40, Tyr41,
Gly42, Asn73, Thr74, Thr116, Thr119, Glu151, Thr185 and His186.

The threonine residues have features of interest as they relate to the binding
site of the S. aureus glutamate racemase. On one side (Cys184) of the binding site,
Thr185 is H-bonded to His186 and His186 is further H-bonded to Glu151. The
hydroxyl oxygen (O) of Thr185 is 2.9 Å away from the amino nitrogen (N) of the
substrate (i.e., glutamate) and 4.4 Å from the sulfur (S) atom of Cys184, which is
4.1 Å from the nitrogen of His186.

On the other side of the binding site (Cys72), Thr74 is H-bonded to Thr116
and further H-bonded to Thr119. The hydroxyl O of Thr74 is H-bonded to one of
the carboxylate oxygen atoms of the substrate and is 4.3 Å away from the S atom of
Cys72. Analysis showed that the three hydroxyl oxygens form a triangle, all within
less than 3.4 Å from one another.

The two TRIADs may play important roles in altering the pKa of the two 
binding site cysteine residues (in addition to that of the neighboring hydrophobic 
core), facilitating the proton transfer during catalysis or both. 

Another embodiment of the present invention is a crystal of S. aureus
glutamate racemase, comprising a binding site comprising the amino acid residues
Cys72, Cys184, Ser10, Thr74, Thr116, Thr119, Glu151, Thr185 and His186 of the
amino acid sequence represented by SEQ ID NO: 46. This crystal may be further
complexed with D-glutamate. Additionally, the binding site of the crystal
complexed with D-glutamate comprises the amino acid residues: Cys72, Cys184,
and at least one of the following amino acid residues: Ser10, Pro40, Tyr41, Gly42,
Asn73, Thr74, Thr116, Thr119, Glu151, Thr185 and His186. A further embodiment
is one in which the binding site additionally comprises at least one of the following
amino acid residues: Asp9, Gly11, Val12, Gly13, Gly14, Arg38, Cys39, Pro43,
Arg44, Ala71, Ala75, Val94, Gly115, Val147, Gly183 and Tyr187. A further
embodiment is one in which the binding site additionally comprises at least one of
the following amino acid residues: Ile8, Thr16, Val17, Tyr32, Asp35, Val49, Ile70,
Ser76, Ala77, Leu114, Gly118, Ile120, Tyr125, Pro144, Pro148, Leu182 and
Ser207. The crystals comprising binding sites represented by the above amino acid
residues comprise the amino acid sequence represented by SEQ ID NO: 46,
or an amino acid sequence that is at least 75%, 80%, 85%, 90%, 95%, 98%, or 99%
homologous to the amino acid sequence represented by SEQ ID NO: 46.

Location and geometry of a binding site of the enzyme are also defined.
Two conserved cysteine (Cys) residues were identified as the residues responsible
for the (de)protonation of the alpha-carbon of the substrate during catalysis, which is
consistent with the two-base mechanism proposed for function of the enzyme in its
role as a racemase. The two binding site cysteines are Cys72 and Cys184, which are
about 7.3 Angstroms (Å) apart (Cα-Cα distance, which is the distance between
Cα atoms). Other amino acid residues identified in the vicinity of the binding site
include Ser10, Thr74, Thr116, Thr119, Glu151, Thr185 and His186. The bound
substrate/product, D-glutamate, is located between the two conserved cysteine
residues. Further detail of the binding site is as follows: looking down the axis
defined by Glu151 along the γ and α carbons, the two cysteines exhibit a rather
symmetrical environment; each has a hydrophobic core behind it, with respect to the
substrate, and a neighboring threonine residue not far from the substrate C-alpha (3.4
- 4.3 Å), respectively.

Analysis of the crystal structure of the S. aureus glutamate racemase
indicates that the following amino acid residues are within 10 Å of the bound
D-glutamate: 8-17, 32-44, 49, 52, 71-79, 94-95, 114-120, 122, 125, 184-190, and
207-208. Analysis of the crystal structure also shows that the following amino acid
residues are within 4 Å of the bound D-glutamate: 9-10, 39-42, 72-74, 116, 119, and
184-186. Of interest is the fact that only one acidic amino acid residue (Asp9) is
present in the structure surrounding the binding site Cys72, which serves as an
anchoring point for the amino group of the substrate/product with a polar (charged)
interaction between the amino nitrogen atom of D-glutamate and the delta oxygen
atom of Asp9.

C. Atypical - H. pylori

H. pylori Murl (Murl) has been crystallized, and crystal structures
(three-dimensional structures) determined. The structures determined represent that
of Murl alone, or in complex with the enzyme substrate, glutamate. NMR data suggest
that multiple folded conformations of Murl can exist in the absence of substrate.

Results show that the asymmetric unit of the crystal consists of two copies of
the polypeptide chain corresponding to a dimer; the active form of H. pylori Murl
(Murl) is a dimer. Each monomer consists of two distinct and very similar
alpha/beta type domains that are held together by a hinge situated at one end of the
domain interface (around amino acid residues 92 and 210).

The two domains together form the active site that includes residues from 
only one of the monomers, despite the fact that the active site is situated close to the 
dimer interface. Both domains contribute important amino acid residues to the 
binding site. For example, the two conserved cysteines are from different domains. 

The H. pylori Murl dimer is held together by a stable and conserved
hydrophobic core created by a four helix bundle (e.g., two helices from each
monomer, residues 143 to 169) that links the two domains and, as a result, the two
monomers, rather rigidly together. This arrangement leaves the other two domains
of each monomer freer to move, which gives access to the two active sites at the
dimer interface. Results show that the protein is a dimer also in the absence of a
ligand, and is more flexible in this state.

The amino acid residues which make up the binding site of Murl are
conserved in seventeen strains of H. pylori that have been sequenced and the amino
acid sequences determined.

One of ordinary skill in the art would recognize that solving crystal
structures of proteins such as H. pylori Murl (Murl) requires a stable, high-quality
protein.

One embodiment of the present invention relates to a crystal of Murl. In one
embodiment, the present invention is crystallized H. pylori Murl characterized by
the structural coordinates depicted in Figure 4. In one embodiment, the present
invention is crystallized H. pylori Murl complexed with D-glutamate characterized
by the structural coordinates depicted in Figure 5. One embodiment of the
crystallized complex is characterized as belonging to the orthorhombic space group
P2₁2₁2₁ and has cell dimensions of a = 62.14 Å, b = 81.07 Å and c = 113.82 Å,
wherein α = 90°, β = 90°, and γ = 90°. Another embodiment of the crystallized
complex is characterized as belonging to the monoclinic space group P2₁ and has
cell dimensions of a = 59.20 Å, b = 82.40 Å and c = 106.50 Å, wherein α = 90°, β =
92.15°, and γ = 90°. Another embodiment of the crystallized complex is
characterized as belonging to the monoclinic space group P2₁ and has cell
dimensions of a = 52.28 Å, b = 78.96 Å and c = 59.14 Å, wherein α = 90°, β =
92.64°, and γ = 90°. Another embodiment of the crystallized complex is
characterized as belonging to the monoclinic space group P2₁ and has cell
dimensions of a = 52.02 Å, b = 80.66 Å and c = 59.18 Å, wherein α = 90°, β =
92.65°, and γ = 90°. Another embodiment of the crystallized complex is
characterized as belonging to the monoclinic space group P2₁ and has cell
dimensions of a = 52.61 Å, b = 78.40 Å, and c = 59.43 Å, and wherein α = 90°, β =
92.33°, γ = 90°. Another embodiment of the crystallized complex is characterized
as belonging to the monoclinic space group P2₁ and has cell dimensions of a = 62.9
Å, b = 81.8 Å, and c = 113.6 Å, and wherein α = 90°, β = 90°, γ = 90°. Each of
these embodiments is encompassed by the structural coordinates of Figure 4, or in
complex with D-glutamate as encompassed by the structural coordinates of Figure 5.
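
As an illustration of how the cell dimensions recited above relate to the size of the unit cell, the following is a minimal sketch that applies the general (triclinic) volume formula V = abc·sqrt(1 - cos²α - cos²β - cos²γ + 2·cosα·cosβ·cosγ). The function name `unit_cell_volume` is hypothetical; the two sets of cell dimensions are taken from the first two embodiments above.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Å from edge lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

# Orthorhombic cell: all angles are 90 degrees, so the volume reduces to a*b*c.
ortho = unit_cell_volume(62.14, 81.07, 113.82, 90, 90, 90)

# Monoclinic cell: only beta differs from 90 degrees, so the volume
# reduces to a*b*c*sin(beta).
mono = unit_cell_volume(59.20, 82.40, 106.50, 90, 92.15, 90)

print(round(ortho), round(mono))
```

For the orthorhombic and monoclinic cells described here the general formula collapses to the simpler expressions noted in the comments, which provides a quick consistency check on a set of recited cell parameters.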

One embodiment of the present invention is a crystal of H. pylori Murl
complexed with the product (substrate) D-glutamate wherein the Murl is at least
70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% homologous to the amino acid
sequence represented by SEQ ID NO: 2, or portions thereof. In the present
invention, the crystals diffract from about 0.8 Å to about 3.5 Å.

One embodiment of the present invention relates to a crystal of Murl in
complex with an inhibitor (e.g., an antibacterial binding agent, drug, etc.). In one
embodiment, the present invention relates to crystallized H. pylori Murl in complex
with the substrate, D-glutamate. In a further embodiment of the present invention,
the crystallized complex is characterized by the structural coordinates depicted in
Figure 4, wherein the determined structures presented represent that of Murl in
complex with D-glutamate. In one embodiment, the present invention discloses
crystallized H. pylori Murl in complex with an inhibitor and substrate (e.g.,
glutamate). In a further embodiment of the present invention, the crystallized
complex is characterized by the structural coordinates depicted in Figure 5, wherein
the determined structures presented represent that of Murl in complex with an
inhibitor and glutamate. In a further embodiment, the antibacterial binding agent is
a pyrimidinedione. In one embodiment of the present invention, the crystal complex
is characterized by the structural coordinates depicted in Figure 4.

A further embodiment of the crystallized complex is characterized as
belonging to the orthorhombic space group P2₁2₁2₁ and has cell dimensions of a =
61.41 Å, b = 76.31 Å and c = 108.92 Å, wherein α = 90°, β = 90°, and γ = 90°,
wherein the crystallized complex is characterized by the structural coordinates
depicted in Figure 5.

A further embodiment is crystallized H. pylori Murl. In a further
embodiment of the present invention, the crystallized Murl is characterized by the
structural coordinates depicted in Figure 6, wherein the determined structures
presented represent that of Murl in complex with glutamate and the pyrimidinedione
inhibitor, compound A.

A further embodiment is crystallized H. pylori Murl in complex with an
inhibitor (e.g., an antibacterial binding agent, drug, etc.). In a further embodiment of
the present invention, the crystallized complex is characterized by the structural
coordinates depicted in Figure 7, wherein the determined structures presented
represent that of Murl in complex with compound A. In a further embodiment, the
inhibitor is a pyrimidinedione, such as an imidazolyl pyrimidinedione, a
thiophenyl-pyrimidinedione, a furanyl-pyrimidinedione, a pyrazolo-pyrimidinedione,
or a pyrrolyl pyrimidinedione.

In a further embodiment, the pyrimidinedione is compound A, compound B,
compound C, compound D, compound E, compound F, compound G, compound H,
compound I, compound J, compound K, compound L, compound M, compound N,
compound O, compound P, compound Q, compound R, compound S, compound T,
compound U, compound V, compound W, compound X, compound Y, compound
Z, compound AA, compound AB, compound AC, compound AD, compound AE,
compound AF, compound AG, compound AH, compound AI, compound AJ, or
compound AK, wherein the crystalline complexes are characterized by the space
groups and cell dimensions depicted in Table 5.

One further embodiment of the present invention relates to a crystal of H.
pylori Murl complexed with a pyrimidinedione having the orthorhombic space
group P2₁2₁2, and having cell dimensions a = 60.7 Å, b = 77.5 Å, c = 56.6 Å, and α
= 90°, β = 90°, and γ = 90°. A further embodiment encompasses this crystal made
by the process of preparing a first solution of about 5 mM D,L-glutamate; about 1
mM TCEP; about 200 mM ammonium acetate; about 0.1 M Tris pH 7.4-8.5; about
500 micromolar of compound A; and about 10 mg/ml Murl from H. pylori;
preparing a second solution of about 0.1 M Tris pH 7.4-8.5; about 20-25% PEG
3350; about 15-25% glycerol; and about 0.2 mM ammonium acetate; combining the
first solution and the second solution, thereby producing a combination; and forming
drops from the combination under conditions in which crystals of Murl are
produced.

One further embodiment of the present invention relates to a crystal of H.
pylori Murl complexed with a pyrimidinedione having the monoclinic space group
P2₁, and having cell dimensions a = 57.1 Å, b = 78.0 Å, c = 58.55 Å, and α = 90°, β
= 97.91°, and γ = 90°. A further embodiment encompasses this crystal made by the
process of preparing a first solution of about 5 mM D,L-glutamate; about 1 mM
TCEP; about 200 mM ammonium acetate; about 0.1 M Tris pH 7.4-8.5; about 500
micromolar of compound A; and about 10 mg/ml Murl from H. pylori; preparing a
second solution with about 0.1 M Tris pH 7.4-8.5; about 20-25% PEG 3350; about
15-25% glycerol; and about 0.2 mM ammonium acetate; combining the first solution
and the second solution, thereby producing a combination; and forming drops from
the combination in a method of crystallization under conditions in which crystals of
Murl are produced.

III. Variants 

Variants of the present invention may have an amino acid sequence that is
different by one or more amino acid substitutions from the sequence disclosed in
SEQ ID NOS: 2 or 44, for example. Embodiments which comprise amino acid
deletions and/or additions are also contemplated. The variant may have
conservative changes (amino acid similarity), wherein a substituted amino acid has
structural or chemical properties similar to those of the amino acid residue it
replaces (e.g., the replacement of leucine with isoleucine). Guidance in determining
which and how many amino acid residues may be substituted, inserted, or deleted
without abolishing biological or pharmacological activity may be reasonably
inferred in view of this disclosure and may further be found using computer
programs well known in the art, for example, DNAStar® software.

Amino acid substitutions may be made, for instance, on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the
amphipathic nature of the residues as long as a biological and/or pharmacological
activity of the native molecule is retained.

Negatively charged amino acids include aspartic acid and glutamic acid;
positively charged amino acids include lysine and arginine; amino acids with
aliphatic side chains include glycine, alanine, leucine, isoleucine, and valine;
amino acids with uncharged polar side chains include asparagine, glutamine,
serine, and threonine; and amino acids with aromatic side chains include
phenylalanine and tyrosine.

Example substitutions are set forth in Table 1 as follows: 
Table 1:

Original Residue    Example conservative substitutions
Ala (A)             Gly; Ser; Val; Leu; Ile; Pro
Arg (R)             Lys; His; Gln; Asn
Asn (N)             Gln; His; Lys; Arg
Asp (D)             Glu
Cys (C)             Ser
Gln (Q)             Asn
Glu (E)             Asp
Gly (G)             Ala; Pro
His (H)             Asn; Gln; Arg; Lys
Ile (I)             Leu; Val; Met; Ala; Phe
Leu (L)             Ile; Val; Met; Ala; Phe
Lys (K)             Arg; Gln; His; Asn
Met (M)             Leu; Tyr; Ile; Phe
Phe (F)             Met; Leu; Tyr; Val; Ile; Ala
Pro (P)             Ala; Gly
Ser (S)             Thr
Thr (T)             Ser
Trp (W)             Tyr; Phe
Tyr (Y)             Trp; Phe; Thr; Ser
Val (V)             Ile; Leu; Met; Phe; Ala



In the present invention, "amino acid homology" is a measure of the identity
of primary amino acid sequences. In order to characterize the homology, subject
sequences are aligned so that the highest percentage homology (match) is obtained,
after introducing gaps, if necessary, to achieve maximum percent homology. N- or
C-terminal extensions shall not be construed as affecting homology. "Identity" per
se has an art-recognized meaning and can be calculated using published techniques.
Computer program methods to determine identity between two sequences include,
for example, DNAStar® software (DNAStar Inc., Madison, WI); the GCG®
program package (Devereux, J., et al., Nucleic Acids Research (1984) 12(1): 387);
BLASTP, BLASTN, FASTA (Altschul, S.F. et al., J. Molec. Biol. (1990) 215: 403).
Homology (identity) as defined herein is determined using the well-known computer
program, BESTFIT® (Wisconsin Sequence Analysis Package, Version 8 for Unix,
Genetics Computer Group, University Research Park, 575 Science Drive, Madison,
WI 53711). When using BESTFIT® or any other sequence alignment program
(such as the Clustal algorithm from MegAlign software (DNAStar®)) to determine
whether a particular sequence is, for example, about 90% homologous to a reference
sequence, according to the present invention, the parameters are set such that the
percentage of identity is calculated over the full length of the reference nucleotide
sequence or amino acid sequence and that gaps in homology of up to about 90% of
the total number of nucleotides in the reference sequence are allowed.

Ninety percent homology is therefore determined, for example, using the
BESTFIT® program with parameters set such that the percentage identity is
calculated over the full length of the reference sequence, e.g., SEQ ID NOS: 2 or 44,
and up to 10% of the amino acids in the reference sequence may be substituted with
another amino acid. Percent homologies are likewise determined, for example, to
identify preferred species, within the scope of the claims appended hereto, which
reside within the range of about 70% to 100% homology to SEQ ID NOS: 2 or 4, as
well as the binding site thereof. As noted above, N- or C-terminal extensions shall
not be construed as affecting homology. When comparing two sequences, the
reference sequence is generally the shorter of the two sequences. This means that,
for example, if a sequence of 50 nucleotides in length has exact identity to a 50
nucleotide region within a 100 nucleotide polynucleotide, there is 100% homology
for that region as opposed to only 50% homology.
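
The convention described above, in which percent identity is calculated over the full length of the shorter (reference) sequence, can be illustrated with a minimal sketch. This is not the BESTFIT® algorithm: it is a simplified, ungapped sliding-window comparison, and the function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Best ungapped percent identity, computed over the full length of
    the shorter (reference) sequence against every window of the longer one."""
    reference, subject = sorted((seq_a, seq_b), key=len)
    n = len(reference)
    if n == 0:
        raise ValueError("sequences must be non-empty")
    best = 0
    for start in range(len(subject) - n + 1):
        window = subject[start:start + n]
        matches = sum(r == s for r, s in zip(reference, window))
        best = max(best, matches)
    return 100.0 * best / n

# Hypothetical 50-nucleotide reference embedded in a 100-nucleotide sequence.
reference = "ACGTACGTTG" * 5
longer = "T" * 25 + reference + "G" * 25

# Hypothetical variant with 5 substitutions in 50 positions.
variant = "CCGTACGTTG" * 5

print(percent_identity(reference, longer))
print(percent_identity(reference, variant))
```

Because the denominator is the length of the reference rather than the longer sequence, an exact 50-nucleotide match within a 100-nucleotide sequence scores as complete identity for that region, as described in the text.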

Although the naturally occurring polypeptide of a sequence such as SEQ ID NO: 2
and a variant polypeptide may be only 90% identical, they are actually likely to have
a higher degree of similarity, depending on the number of dissimilar codons that are
conservative changes. Conservative amino acid substitutions can frequently be
made in a protein without altering either the conformation or function of the protein.
Similarity between two sequences includes direct matches, as well as conserved
amino acid substitutions which have similar structural or chemical properties, e.g.,
similar charge as described in Table 1.

Percentage similarity (conservative substitutions) between two polypeptides 
may also be scored by comparing the amino acid sequences of the two polypeptides 
by using programs well known in the art, including the BESTFIT program, by 
employing default settings for determining similarity. 
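
The distinction between percent identity and percent similarity can be sketched as follows for two pre-aligned sequences of equal length. The substitution map below is a partial transcription of Table 1 (one-letter codes) and is applied in the direction reference residue to variant residue; the function name `identity_and_similarity` and the example peptides are hypothetical, and the scoring is a simplification rather than the method of any particular program.

```python
# Partial, one-letter-code transcription of Table 1 (reference -> allowed variants).
CONSERVATIVE = {
    "D": {"E"},
    "E": {"D"},
    "S": {"T"},
    "T": {"S"},
    "C": {"S"},
    "K": {"R", "Q", "H", "N"},
    "R": {"K", "H", "Q", "N"},
    "L": {"I", "V", "M", "A", "F"},
    "I": {"L", "V", "M", "A", "F"},
}

def identity_and_similarity(reference, variant):
    """Return (percent identity, percent similarity) for aligned, equal-length
    sequences; similarity counts identical plus conservative positions."""
    if len(reference) != len(variant) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(r == v for r, v in zip(reference, variant))
    conservative = sum(
        r != v and v in CONSERVATIVE.get(r, set())
        for r, v in zip(reference, variant)
    )
    n = len(reference)
    return 100.0 * identical / n, 100.0 * (identical + conservative) / n

# Hypothetical peptides: D->E, K->R and S->T are conservative; C->W is not.
identity, similarity = identity_and_similarity("DLKSTC", "ELRTTW")
print(round(identity, 1), round(similarity, 1))
```

In this example two of six positions are identical and three further positions are conservative substitutions, so the similarity score is higher than the identity score.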

A further embodiment of the invention is a crystal of Murl having an amino
acid sequence represented by any one of SEQ ID NOS: 2-34, 40, 44, 46, 48, 50, 52,
54, 56, 58, 60, 62, 64, 66, and 74. A further embodiment of the invention is a crystal
of Murl, wherein the amino acid sequence is at least 75%, 80%, 85%, 90%, 95%,
98%, or 99% homologous to the amino acid sequence represented by any one of
SEQ ID NOS: 2-34, 40, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, and 74.

Mutations/substitutions in the present invention can be made using 
techniques well-known in the art (e.g. site-directed mutagenesis) using molecular 
based methods. 

IV. Structural Homology and Conserved Structural Elements

Applicants have determined that Murl from Gram negative, Gram positive,
and atypical bacteria have common structural elements that are constant even in
view of primary amino acid sequences that differ with respect to their length of
sequences and overall content. These common structural elements are depicted in
Figures 1 and 2. As used herein, the term "structural homology" refers to conserved
structural motifs which may vary with respect to the primary amino acid sequence
up to 75%, but have similar or identical tertiary structures.

Figure 1A illustrates a monomer of Murl depicting conserved structural
elements. The topology of the conserved structural elements is illustrated in Figure
1B. One of ordinary skill in the art would recognize that all Murls share certain
structural features as indicated by the column headed: "All". Specific structural
features are conserved among atypical and Gram positive bacteria ("Murl - G-"),
and others are conserved within Gram positive bacteria ("G+").
Subsets of the conserved structural elements (e.g., strands and helices) are depicted
by the abbreviations H (helices), E (strands), S (loops) and T (turns). Structural
elements conserved across all Murl are indicated by a number in the column entitled
"All Murl"; structural elements conserved in Murl of Gram positive and atypical
bacteria are indicated by a number in the column headed "Murl - E. coli"; and
structural elements conserved in Murl of Gram positive bacteria are indicated by a
number in the column headed "G+". The number "1" indicates that the amino acid
residue resides in domain 1 of Murl, whereas the number "2" indicates that the
amino acid residue resides in domain 2 of Murl.

As used herein, the term "accessory site" means a combination of any of the
conserved structural elements, a subset of any of the conserved structural elements,
or part or all of a binding site of the present invention.

A. Gram negative 

Topology and conserved structural elements of E. coli Murl are illustrated in
Figures 1A and 2, respectively. One embodiment of the present invention relates to
conserved structural elements of E. coli Murl represented by helices (H), beta sheets
(E), loops (S), and turns (T). Examples of conserved structural elements
of Gram negative bacteria of the present invention are S1: 24V-28D; H1: 36V-43L;
S2: 49Y-53F; H2: 65E-80Q; S3: 87L-90V; H3: 93A-98V; H4: 100L-105E; S4:
111V-112V; H5: 118I-124L; S5: 130V-134A; H6: 136R-141R; H7: 143Y-150R;
S6: 157I-161G; H8: 165M-170E; H9: 180L-186I; S7: 199T-202L; H10: 211Q-217V;
S8: 223R-225V; H11: 228G-238L; S9: 253I-256C; H12: 261P-271Q; and
S10: 227T-279E.
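
Element listings written in the form used above (for example, "H2: 65E-80Q", meaning helix 2 spans residue 65, a glutamate, through residue 80, a glutamine) lend themselves to simple machine parsing. The following is a minimal sketch; the function name `parse_elements` and the tuple layout are hypothetical conveniences, not part of the disclosure.

```python
import re

# Matches entries such as "H2: 65E-80Q": element label, then start and end
# residue numbers, each followed by a one-letter amino acid code.
ELEMENT = re.compile(r"\b([HS]\d+):\s*(\d+)([A-Z])-(\d+)([A-Z])\b")

def parse_elements(text):
    """Return a list of (label, start, start_residue, end, end_residue) tuples."""
    return [
        (label, int(start), start_aa, int(end), end_aa)
        for label, start, start_aa, end, end_aa in ELEMENT.findall(text)
    ]

listing = "S2: 49Y-53F; H2: 65E-80Q; S3: 87L-90V"
elements = parse_elements(listing)

for label, start, start_aa, end, end_aa in elements:
    # Length is inclusive of both end points.
    print(label, start, end, end - start + 1)
```

Parsed ranges of this kind could then be compared across species, for example to check that corresponding elements have similar lengths.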

B. Gram positive

1. E. faecalis

Conserved structural elements of E. faecalis Murl are illustrated in Figures
1A, 1B and 2. One embodiment of the present invention relates to conserved
structural elements of E. faecalis Murl represented by helices (H), beta sheets (E),
loops (S), and turns (T). Examples of conserved structural elements of the present
invention are S1: 7I-11D; H1: 19V-26Q; S2: 32L-36G; H2: 48A-63L; S3: 69M-72I;
H3: 75N-80V; H4: 82L-87A; S4: 93V-94V; H5: 98L-107V; S5: 113I-117G; H6:
119L-124S; H7: 126S-133S; S6: 140V-144A; H8: 148F-153E; H9: 162A-169T; S7:
180T-183L; H10: 192R-198V; S8: 204T-206I; H11: 209G-219L; S9: 237E-240T; and
H12: 245K-255L.

2. E. faecium

Conserved structural elements of E. faecium Murl are illustrated in Figures
1A, 1B and 2. One embodiment of the present invention comprises the conserved
structural elements of E. faecium Murl represented by helices (H), beta sheets (E),
loops (S), and turns (T). Examples of conserved structural elements of the present
invention are S1: 10I-14D; H1: 22V-29Q; S2: 35I-39G; H2: 51A-66V; S3: 72M-75I;
H3: 78N-83V; H4: 85L-90A; S4: 96V-97I; H5: 101L-110A; S5: 116V-120G;
H6: 122I-127S; H7: 129A-136E; S6: 143V-147A; H8: 151F-156E; H9: 166K-172T;
S7: 183T-186L; H10: 195R-201V; S8: 207Q-209I; H11: 212G-222L; S9: 240Q-243T;
H12: 248K-258L; and S10: 264E-266E.

3. S. aureus

Conserved structural elements of S. aureus Murl are illustrated in Figures
1A, 1B and 2. One embodiment of the present invention comprises the conserved
structural elements of S. aureus Murl represented by helices (H), beta sheets (E),
loops (S), and turns (T). Examples of conserved structural elements of the present
invention are S1: 5I-9D; H1: 17V-24Q; S2: 30I-34C; H2: 46G-61M; S3: 67M-70I;
H3: 73N-78V; H4: 80L-85K; S4: 91V-92I; H5: 96E-105T; S5: 111V-115G; H6:
117E-122S; H7: 124A-131R; S6: 138V-142A; H8: 146F-151E; H9: 162S-168T; S7:
179T-182L; H10: 191Y-197Y; S8: 203T-205I; and H11: 208G-218L.

C. Atypical 

1. H. pylori

Conserved structural elements of H. pylori Murl are illustrated in Figures
1A, 1B and 2. One embodiment of the present invention comprises the conserved
structural elements of H. pylori Murl represented by helices (H), beta sheets (E),
loops (S), and turns (T). Examples of conserved structural elements of the present
invention are S1: 3I-7D; H1: 15V-22A; S2: 28I-32G; H2: 44P-59K; S3: 65L-68V;
H3: 71N-76L; H4: 78L-83K; S4: 89I-90V; H5: 94E-103Q; S5: 111I-115G; H6:
117K-122S; H7: 124A-131Q; S6: 137M-141A; H8: 145F-150E; H9: 160E-166Y; S7:
176V-179L; H10: 188A-194Y; S8: 206L-208F; H11: 211G-221K; S9: 235E-238A;
and H12: 243I-253L.

2. A. pyrophilus

Conserved structural elements of A. pyrophilus Murl are illustrated in
Figures 1A, 1B and 2. One embodiment of the present invention comprises the
conserved structural elements of A. pyrophilus Murl represented by helices (H),
beta sheets (E), loops (S), and turns (T). Examples of conserved structural elements
of the present invention are S1: 3I-7D; H1: 15V-22R; S2: 28I-32G; H2: 44K-59K; S3:
65I-68V; H3: 71N-76Y; H4: 78L-83K; S4: 89V-90F; H5: 94E-103K; S5: 109I-113G;
H6: 115P-120S; H7: 122A-129E; S6: 134V-138A; H8: 142F-147E; H9:
157R-163Y; S7: 173T-176L; H10: 185K-191F; S8: 196E-198V; H11: 201S-211F;
S9: 221E-224F; H12: 229P-239L; and S10: 245V-247L.

C. Common structural elements for Gram positive and atypical Murl

Analysis of the atomic coordinates with reference to the 3-dimensional
structure of Murl showed that Murl from Gram positive and atypical bacteria
share conserved structural elements and fragments regardless of differences in
primary amino acid sequence. The shared elements are illustrated below in Table 2.
The thirteen shared elements are given in terms of the primary amino acid
sequence represented in Figure 2.



Table 2

Element  S. aureus   E. faecalis  E. faecium  H. pylori   A. pyrophilus
1        4P-24Q      6A-26Q       9P-29Q      2K-22A      2K-22R
2        29T-57A     31R-59A      34N-62T     27E-55L     27D-55A
3        60L-94V     62L-96V      65L-99V     58F-92V     58L-92V
4        98G-104M    100G-106K    103G-109K   96S-102R    96G-102K
5        110N-129I   112K-131I    115Q-134L   110P-129L   108K-127L
6        137E-142A   139E-144A    142T-147A   136N-141A   133D-138A
7        162S-166H   163K-167A    166K-170A   160E-164H   157R-161E
8        177S-183G   178L-184G    181I-187G   174P-180G   171I-177G
9        191Y-199G   192R-200G    195R-203G   188A-196M   185K-193G
10       202K-206S   203V-207D    206V-210D   205P-209H   195A-199D
11       210E-217A   211E-218M    214A-221M   213A-220Q   203A-210N
12       234H-240G   236H-242G    239C-245G   234V-240G   220L-226D
13       245I-253L   247F-255L    250F-258L   245L-253L   231L-239L


D. Common structural elements for all Murl

Analysis of the atomic coordinates with reference to the 3-dimensional
structure of Murl showed that all Murl share conserved structural elements and
fragments regardless of differences in primary amino acid sequence. The shared
elements are illustrated below in Table 3. The thirteen shared elements are given
in terms of the primary amino acid sequence represented in Figure 2.



Table 3

Element  E. coli     S. aureus   E. faecalis  E. faecium  H. pylori   A. pyrophilus
1        23T-43L     4P-24Q      6A-26Q       9P-29Q      2K-22A      2K-22R
2        48H-76V     29T-57A     31R-59A      34N-62T     27E-55L     27D-55A
3        81E-114V    62E-94V     64K-96V      67E-99V     60P-92V     60D-92V
4        117A-120R   98G-101T    100G-103A    103G-106A   96S-99A     96G-99E
5        122A-124L   103I-105T   105V-107V    108V-110K   101K-103Q   101L-103K
6        129I-141R   110N-122S   112K-124S    115Q-127S   110P-122S   108K-120S
7        156Q-161G   137E-142A   139E-144A    142T-147A   136N-141A   133D-138A
8        180L-188R   162S-170K   163K-171Q    166K-174A   160E-168T   157R-165K
9        197P-203G   177S-183G   178L-184G    181I-187G   174P-180G   171I-177G
10       211Q-218L   191Y-198F   192R-199M    195R-202M   188A-195F   185K-192L
11       222T-225V   202K-205I   203V-206I    206V-209I   205P-208I   195A-198V
12       228G-237W   208G-217A   209G-218M    212G-221M   211G-220Q   201S-210N
13       252N-257M   234H-239T   236H-242T    239C-244T   234V-239S   220L-225T

E. Common structural elements for Gram positive Murl 

Analysis of the atomic coordinates with reference to the 3-dimensional
structure of Murl showed that Murl from Gram positive bacteria share conserved
structural elements and fragments regardless of differences in primary amino acid
sequence. The shared elements are illustrated below in Table 4. The seven shared
elements are given in terms of the primary amino acid sequence represented in
Figure 2.



Table 4

Element  S. aureus   E. faecalis  E. faecium
1        2N-144P     4Q-146P      7N-149P
2        146F-157D   148F-159S    151F-162S
3        159T-173R   160S-174Q    163S-177T
4        177S-199G   178L-203G    181I-203G
5        201K-206S   202H-207D    205N-210D
6        208G-227S   209G-228T    212G-231S
7        234H-264V   236H-267L    239C-270L

V. Binding Sites

Murl is highly flexible and, depending on its conformation, can bind to a
wide variety of structures. Murl from Gram positive bacteria, such as E. faecalis,
E. faecium, and S. aureus; Gram negative bacteria, such as E. coli; and atypical
bacteria, such as H. pylori and A. pyrophilus, have been crystallized.

A study of the structural coordinates and biochemistry has revealed a
molecular interface along an axis of Murl that is highly flexible and allows the
enzyme to exist in multiple conformations, which are determined by the substances
that bind to the active site, intradomain interface, or intermolecular dimer interface,
and, in the case of Murl from Gram negative bacteria, the activator site. As such,
the enzyme binds multiple substrates and inhibitors with a wide variety of structural
features and sizes.

Murl from Gram positive and atypical bacteria comprises a dimeric
structure of two monomers, each of which is comprised of two domains. The
"molecular interface" of Murl in Gram positive and atypical bacteria has three
domains: a substrate binding site, an intermolecular dimer interface, and an
intradomain interface.

Murl from Gram negative bacteria is a monomer having three domains
along the molecular interface: a substrate binding site, an activator binding site, and
an intradomain interface. All Murl described herein have an "intradomain
interface". When an inhibitor binds to the intradomain interface, it prevents
movement of the enzyme, thereby inhibiting the enzyme.

Murl from Gram positive bacteria are dimers, each monomer of the dimer
having two domains. Murl from Gram positive bacteria have an intermolecular dimer
interface in which the two substrate binding sites face outward on opposite ends of
the enzyme, such that Murl can simultaneously bind two substrate moieties.

Murl from atypical bacteria, such as H. pylori, are dimers having two
monomers, each monomer having two domains, and each domain having a substrate
binding site. The substrate binding site of A. pyrophilus Murl differs slightly in that
the substrate is in contact with residues from each monomer, each containing key
elements of the active site.

As used herein, the term "binding site" refers to a specific region (or atom)
of Murl that enters into an interaction with a molecule that binds to Murl. A binding
site can be, for example, a conserved structural element or a combination of several
conserved structural elements, a substrate binding site, an activator binding site, an
inhibitor binding site, an intradomain interface, or an intermolecular dimer interface.

As used herein, the term "substrate binding site" refers to a specific region
(or atom) of Murl that interacts with a substrate, such as D-glutamate. In the present
application, the terms substrate binding site and active site are used interchangeably.
A substrate binding site may comprise, or be defined by, the three dimensional
arrangement of two or more amino acid residues within a folded polypeptide.

In the present invention, a substrate can be a compound such as L-glutamate
or D-glutamate which Murl reversibly converts from the R- to S-enantiomer. Thus,
a substrate can also act as a product. The substrate can be a naturally-occurring or
artificial compound.

In the present invention, an inhibitor can be a compound which may also
undergo a catalytic reaction, or which binds to the substrate binding site or another
site on Murl, and which competes with substrate turnover of glutamate. Inhibitors of
the present invention can be a compound such as L-serine-O-sulfate, D-serine-O-sulfate,
D-aspartate, L-aspartate, tartrate, citrate, phosphate, sulfate, aziridino-glutamate,
N-hydroxyglutamate, or 3-chloroglutamate. The inhibitor can be a naturally-occurring
or artificial compound.

As used herein, the term "activator binding site" refers to a specific region
(or atom) of Murl that interacts with an activator, such as UDP-MurNAc-Ala. An
activator binding site may comprise, or be defined by, the three dimensional
arrangement of one or more amino acid residues within a folded polypeptide. An
activator can be a compound, such as UDP-MurNAc-Ala. The activator can be a
naturally-occurring or an artificial compound. The structure of the activator is
compact when it binds to Murl of Gram negative bacteria. When complexed with
Murl, the activator (such as UDP-MurNAc-Ala) folds back on itself into a
two-layered structure in which two phosphate groups act as a bridging link. The
uridine ring of UDP-MurNAc-Ala stacks against the muramic acid ring. There is almost
a perfect shape match between Murl and the activator because the activator resides on
top of two connecting loops between the two domains, corresponding in sequence to
residues 112-116 and 225-228. In the absence of activator, the two loops act as a
hinge between the two domains. The binding of Murl and the activator is not
entirely driven by favorable van der Waals interactions; there are a number of polar
interactions and at least one strong salt bridge between Arg104 and the carboxylate
group of the alanine motif of the activator. The salt bridge could be a key
interaction for UDP-MurNAc-Ala to be able to lock the two domains into the proper
orientation. This explains the need for the terminal alanine residue in order for the
molecule to work as an activator. Other extensive polar interactions are found
between the uracil ring of UDP-MurNAc-Ala and main chain atoms of Murl
residues 113-115, as well as a number of hydrogen bonds from the hydroxyl groups
of the two sugar rings of UDP-MurNAc-Ala. Many interactions between activator
and Murl are mediated by water molecules. In contrast to the uracil and sugar
moieties, the two phosphate groups of UDP-MurNAc-Ala make very little contact with
the protein since they face the solution. On each side of the
diphosphate group is a positively charged residue, Lys119 on one domain and
Arg233 on the other, which provides another example of a domain-domain
stabilizing interaction.

One part of the molecular interface is an "intermolecular dimer interface"
that is present only in Murl dimers (Gram positive and atypical bacteria) and
occurs at the interface of the monomers that make up the dimer. Analysis of this
intermolecular dimer interface of multiple Murls has revealed that the interface is
highly flexible and also allows rotation of the monomers with respect to one another.
The Cα-Cα distance between the active site cysteines present in the monomer and the
extent of the angle of the opening can vary greatly (See Figure 1C, and Figure 3,
column 3). They determine the structure of substrate or inhibitor that binds to
different regions or components of the enzyme. Applicants have determined, for
each species that has been crystallized, the amino acid residues present in the
intermolecular dimer interface, and have compared the primary amino acid sequence
across the species of Murl from Gram positive and atypical bacteria. Accordingly,
Applicants have been able to determine the degree (in Å) of flexibility of Murl when
bound to several substrates or inhibitors, such that Murl of all Gram positive and
atypical bacteria would be expected to behave in a similar,
flexible manner (See Figure 3, column 7). Applicants have discovered several new
inhibitors or substrates of Murl which bind to the active site. The active site of Murl
is not static, and indeed, is highly flexible, making it possible for a variety of types
of structures to bind to it and inhibit function of the enzyme.

As shown in Figure 3, Applicants have calculated the distance in Å between
the active site cysteines of Murl when different compounds are bound to the
substrate binding site. For example, the distance between cysteine residues of E.
faecium can range from 7.5 Å to 8.7 Å, depending on whether tartrate, citrate, or
phosphate is bound.
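
An inter-cysteine distance of this kind is a Euclidean distance between two atomic
positions. The following is a minimal Python sketch of that calculation; the two
coordinate triples are made-up values chosen for illustration and are not taken from
the structural coordinates of any Figure, and the function names are hypothetical.

```python
import math


def ca_ca_distance(ca_first, ca_second):
    """Euclidean distance (in the units of the input, e.g. angstroms)
    between two C-alpha positions given as (x, y, z) tuples."""
    return math.dist(ca_first, ca_second)


def in_reported_range(distance, low=7.5, high=8.7):
    """True if the distance lies inside the 7.5-8.7 range quoted above for
    E. faecium; the default bounds simply restate that range."""
    return low <= distance <= high


# Illustrative coordinates only (assumed, not from the disclosure).
cys_a = (0.0, 0.0, 0.0)
cys_b = (3.0, 4.0, 6.0)

distance = ca_ca_distance(cys_a, cys_b)
print(round(distance, 2), in_reported_range(distance))
```

Applying the same function to the coordinates of each ligand-bound structure in turn
would give a per-ligand distance that can be compared across complexes.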

A. Gram negative - E. coli 

The coordinates determined represent those of Murl alone, Murl in
complex with the substrate (L-glutamate), Murl in complex with activator
(UDP-MurNAc-Ala), and Murl in complex with both substrate (L-glutamate) and
activator (UDP-MurNAc-Ala). Crystallization of E. coli Murl is described in
Example 2 and Figures 8-11. Results show that the unit of the crystal consists of one
molecule corresponding to a monomer; the native form of Murl from Gram negative
bacteria is a monomer. The monomer has two domains, which both have similar
alpha/beta type folds. The binding of the substrate clearly identifies the binding site
that is situated between the two domains (See left side of Figure 1H). The activator,
UDP-MurNAc-Ala, binds at the opposite side of the protein (See right side of Figure 1H),
possibly acting as a modulator of activity by inducing the correct conformation of
the binding site by modulation of the relative position of the two domains of the
protein. Thus, the hinge region located between the two domains is flexible such
that when activator is bound, the conformation of the protein changes at the hinge
region to make the substrate binding site available.

Gram negative bacteria include, for example, Escherichia species,
Haemophilus influenzae, Klebsiella pneumoniae, Moraxella catarrhalis, Vibrio
cholerae, Proteus mirabilis, Pasteurella multocida, Acinetobacter baumannii,
Bacteroides fragilis, Treponemal species, Borrelial species, Deinococcal species,
Pseudomonas species, Salmonella species, Shigella species, Yersinia species and
Porphyromonas gingivalis.

The crystal structure of the E. coli Murl of the present invention reveals the
three dimensional structure of the binding domains formed by the atoms of the
amino acid residues listed in Figure 10.

1. Substrate Binding Site 

A further embodiment relates to an E. coli Murl in which the substrate binding
site comprises two conserved cysteine residues, denoted Cys92 and Cys204 in the
amino acid sequence represented as SEQ ID NO: 40. A further embodiment
comprises a substrate binding site of E. coli Murl wherein the binding site
additionally comprises one or more of the following amino acid residues: Ser29,
Thr94, Thr135, Thr138, Glu170, Thr205 and His206 as represented by the structural
coordinates of Figure 10. In one embodiment, the substrate binding site of the E.
coli Murl complexes with L-glutamate and comprises amino acid residues Cys92
and Cys204 as well as amino acid residues within 5 Å of the Cys92 and Cys204
residues as represented by the structural coordinates of Figure 8. In one
embodiment, the substrate binding site of the E. coli Murl additionally comprises
one or more of the following amino acid residues: Ser29, Thr94, Thr135, Thr138,
Glu170, Thr205 and His206. In this embodiment, the substrate binding site
complexes with L-glutamate and comprises amino acid residues Cys92 and Cys204
and at least one (i.e., one or more) of the following amino acid residues: Ser29,
Thr94, Thr135, Thr138, Glu170, Thr205 and His206, which can be present in any
combination. In a further embodiment, the substrate binding site includes amino
acid residues Phe27, Asp28, Ser29, Gly30, Val31, Gly32, Gly33, Ser35, Val36,
Asp54, Ala57, Phe58, Pro59, Tyr60, Gly61, Glu62, Lys63, Ile68, Val90,
Ala91, Cys92, Asn93, Thr94, Ala95, Ser96, Thr97, Val114, Val115, Leu133,
Ala134, Thr135, Arg136, Gly137, Thr138, Val139, Thr144, Ala163, Val166,
Glu167, Glu170, Leu202, Gly203, Cys204, Thr205, His206, Phe207 and Ser227.
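
A residue set defined as lying within 5 Å of the two cysteines can, in principle, be
derived from structural coordinates by a distance search. The following is a minimal
Python sketch of such a search. The coordinates are toy values rather than those of
Figure 8 or Figure 10, the name residues_within is hypothetical, and the criterion
used (any atom of a residue within the cutoff of any reference atom) is an
assumption, since the text does not state which atoms are compared.

```python
import math


def residues_within(residue_atoms, reference_atoms, cutoff):
    """Return sorted residue numbers that have at least one atom within
    `cutoff` of at least one reference atom.

    residue_atoms:   dict mapping residue number -> list of (x, y, z) tuples
    reference_atoms: list of (x, y, z) tuples (e.g. atoms of the cysteines)
    """
    hits = []
    for residue_number, atoms in residue_atoms.items():
        if any(
            math.dist(atom, ref) <= cutoff
            for atom in atoms
            for ref in reference_atoms
        ):
            hits.append(residue_number)
    return sorted(hits)


# Toy data: one atom per residue, one reference atom at the origin.
toy_residues = {
    28: [(1.0, 0.0, 0.0)],
    60: [(6.0, 0.0, 0.0)],
    94: [(0.0, 3.0, 0.0)],
    150: [(12.0, 0.0, 0.0)],
}
toy_reference = [(0.0, 0.0, 0.0)]

near = residues_within(toy_residues, toy_reference, 5.0)
print(near)
```

The exhaustive pairwise loop is adequate for a single binding site; for whole
structures a spatial index would normally be substituted.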

In a further embodiment, the substrate binding site of the Murl comprises
two hydrogen bond TRIADs, which occur close to the conserved cysteine residues
of the binding site. Specifically, on one side (Cys204) of the binding site, the
TRIAD is Glu170-Thr205-His206 and on the other side (Cys92), the TRIAD is
Thr94-Thr135-Thr138. Thus, in a specific embodiment of this invention, the
substrate binding site of E. coli Murl complexes with L-glutamate and comprises the
relative structural coordinates of amino acid residues Cys92 and Cys204 and
additionally comprises the relative structural coordinates of at least one (i.e., one or
more) of the following: amino acid residues Ser29, Thr94, Thr135, Thr138, Glu170,
Thr205 and His206.

The threonine residues have features of interest as they relate to the binding
site of the E. coli Murl. On one side (Cys204) of the substrate binding site, Thr205
is H-bonded to His206 and His206 is further H-bonded to Glu170. The hydroxyl
oxygen (O) of Thr205 is 2.8 Å away from the amino nitrogen (N) of the substrate
(i.e., glutamate) and 5.3 Å from the sulfur (S) atom of Cys204, which is 4.3 Å from
His206. On the other side of the substrate binding site (Cys92), Thr94 is H-bonded
to Thr135 and further H-bonded to Thr138. The hydroxyl O of Thr94 is H-bonded
to one of the carboxylate oxygen atoms of the substrate and is 3.3 Å away from the
S atom of Cys92. Analysis showed that the three hydroxyl oxygens form a triangle,
all within less than 3.2 Å from one another.
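
The triangle criterion stated above (every pair of hydroxyl oxygens closer than a
given cutoff) can be checked by testing all pairwise distances. The following is a
minimal Python sketch; the three positions are made-up values for illustration, not
coordinates from the disclosed structure, and the function name is hypothetical.

```python
import itertools
import math


def all_pairs_within(points, cutoff):
    """True if every pair of points is strictly closer than `cutoff`."""
    return all(
        math.dist(first, second) < cutoff
        for first, second in itertools.combinations(points, 2)
    )


# Illustrative hydroxyl-oxygen positions (assumed values).
# Pairwise distances: 3.0, sqrt(8.5) ~ 2.92, sqrt(8.5) ~ 2.92.
hydroxyl_oxygens = [
    (0.0, 0.0, 0.0),
    (3.0, 0.0, 0.0),
    (1.5, 2.5, 0.0),
]

print(all_pairs_within(hydroxyl_oxygens, 3.2))
```

For three atoms this tests exactly three pairs, which corresponds to the three
sides of the triangle described in the text.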

The two TRIADs may play important roles in altering the pKa of the two
substrate binding site cysteine residues (in addition to that of the neighboring
hydrophobic core), facilitating the proton transfer during catalysis or both.

Another embodiment of the present invention is a crystal of E. coli Murl,
comprising a substrate binding site comprising the amino acid residues Cys92,
Cys204, Ser29, Thr94, Thr135, Thr138, Glu170, Thr205 and His206. This crystal
may also comprise Murl complexed with L-glutamate. Additionally, the substrate
binding site of the crystal complexed with L-glutamate can comprise the amino acid
residues: Cys92, Cys204, and one or more of the following amino acid residues:
Ser29, Thr94, Thr135, Thr138, Glu170, Thr205 and His206. A further embodiment
is one in which the substrate binding site additionally comprises at least one of the
following amino acid residues: Asp28, Gly30, Val31, Gly32, Gly33, Ala57,
Phe58, Pro59, Tyr60, Gly61, Glu62, Lys63, Ala91, Asn93, Ala95, Val114, Ala134,
Val166, Gly203 and Phe207. A further embodiment is one in which the substrate
binding site additionally comprises at least one of the following amino acid residues:
Phe27, Ser35, Val36, Tyr51, Asp54, Ile68, Val90, Ser96, Thr97, Val115, Leu133,
Thr135, Arg136, Gly137, Val139, Thr144, Ala163, Glu167, Leu202, and Ser227.
The crystals comprising substrate binding sites encompassed by the above amino
acid residues comprise the amino acid sequence represented by SEQ ID NO:
40, or an amino acid sequence that is at least 75%, 80%, 85%, 90%, 95%, 98%, or 99%
homologous to the amino acid sequence that is represented by SEQ ID NO: 40.
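
Percentage thresholds of this kind are commonly evaluated on a pairwise alignment.
The following is a minimal Python sketch that computes percent identity over the
aligned, non-gap positions of two pre-aligned sequences. Treating "homologous" as
percent identity, the gap handling, the function name, and the short example
sequences are all assumptions made for illustration; the text does not specify the
alignment method or scoring used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither sequence has a gap.

    Both inputs must already be aligned and of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


# Toy sequences (not SEQ ID NO: 40): nine of ten positions match.
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"

identity = percent_identity(reference, candidate)
print(identity, identity >= 75.0)
```

Other conventions divide by the full alignment length or by the shorter sequence
length, which gives different percentages for gapped alignments.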

Location and geometry of the substrate binding site of the enzyme are also
defined. Two conserved cysteine (Cys) residues are identified as the residues
responsible for the (de)protonation of the alpha-carbon of the substrate during
catalysis, which is consistent with the two-base mechanism proposed for function of
the enzyme in its role as a racemase. The two substrate binding site cysteines are
Cys92 and Cys204, which are about 7.6 Angstroms (Å) apart (Cα-Cα distance, i.e.,
the distance between Cα atoms). Other amino acid residues identified in the vicinity
of the substrate binding site include Ser29, Thr94, Thr135, Thr138, Glu170, Thr205
and His206. The bound substrate/product, L-glutamate, is located between the two
conserved cysteine residues. Further detail of the binding site is as follows: looking
down the axis defined by the Glu170 along the γ and α carbon, the two cysteines
exhibit a rather symmetrical environment; each has a hydrophobic core with respect
to the substrate and a neighboring threonine residue not far from the substrate
C-alpha (4.5 - 3.4 Å), respectively.

Analysis of the crystal structure of the E. coli Murl indicates that the
following amino acid residues are within 10 Å of the bound L-glutamate: 27-36,
51-63, 69, 52, 90-98, 227-228, 205-208, 133-139, 141, 114-118, 144, 58, 60, 61, and
65. Analysis of the crystal structure also shows that the following amino acid
residues are within 4 Å of the bound L-glutamate: 28-29, 58-61, 92-94, 135, 138,
and 204-206. Of interest is the fact that only one acidic amino acid residue (Asp28)
is present in the structure surrounding the substrate binding site Cys92, which serves
as an anchoring point for the amino group of the substrate/product with a polar
(charged) interaction between the amino nitrogen atom of L-glutamate and the delta
oxygen atom of Asp28.

2. Activator Binding Site

One embodiment of the present invention relates to an activator binding site
of E. coli Murl. As used herein, the term "activator binding site" refers to a specific
region (or atom) of Murl that interacts with an activator, such as UDP-MurNAc-Ala.
In certain embodiments, a binding site may comprise, or be defined by, the three
dimensional arrangement of one or more amino acid residues within a folded
polypeptide.

A further embodiment relates to E. coli Murl in which the activator binding
site comprises amino acid residues Arg104, Gly113, Val114, Val115, Pro116,
Lys119, Ser227, Ala230, Ile231, Arg233, and Arg234 of SEQ ID NO: 40. A further
embodiment comprises an activator binding site of E. coli Murl wherein the binding
site additionally comprises at least one (i.e., one or more) of the following amino
acid residues: Leu100, Val112, Ala117 and Pro120 as represented by the structural
coordinates of Figure 8. In a further embodiment, the activator binding site
comprises at least one (i.e., one or more) of the following amino acid residues:
Thr97, Pro101, Ala102, Glu105, Lys106, Phe107, Asp108, Phe109, Pro110, Val111,
Gly113, Ile118, Arg123, Leu124, Ser142, Tyr143, Thr144, Glu146, Leu147,
Arg150, Phe151, Asp226, Gly228, Ala229, Trp237, Leu238, Glu240 and His241.

3. Intradomain Binding Site 

In one embodiment of the present invention, E. coli Murl has an intradomain
dimer interface in which the two domains of the monomer interact. The intradomain
dimer interface of E. coli Murl comprises amino acid residues Asp28, Ser29,
Gly30, Val31, Gly32, Gly33, Leu34, Ser35, Val36, Asp38, Glu39, His42, Leu43,
Val56, Ala57, Phe58, Pro59, Tyr60, Gly61, Glu62, Lys63, Ser64, Glu65, Ala66,
Phe67, Ile68, Ala91, Cys92, Asn93, Thr94, Ala95, Ser96, Thr97, Val98, Val114,
Ala230, Ile231, Ala232, Arg233, Arg234, Trp237, Leu238, Pro261, Gly262,
Gln265, Leu266, Pro268, Val269, Leu270, Arg272, and Tyr273 of a first domain,
and amino acid residues Val115, Pro116, Ala117, Ile118, Lys119, Pro120, Ala121,
Arg123, Leu124, Thr135, Arg136, Gly137, Thr138, Val139, Lys140, Arg141,
Ser142, Tyr143, Thr144, Glu146, Leu147, Arg150, Ser162, Ala163, Val166,
Gly167, Ala169, Glu170, Ala171, Lys172, Leu173, His174, Val201, Leu202,
Gly203, Cys204, Thr205, His206, Phe207, Pro208, Leu209, Leu210, Val225,
Asp226, Ser227, Gly228, and Leu229 of a second domain, in which the amino acid
residues are represented by SEQ ID NO: 40.

B. Gram positive 


As described in the examples that follow, Murl of E. faecalis, S. aureus, and
E. faecium have been crystallized and the crystal structure (three-dimensional
structure) of each determined. The structures determined represent that of Murl
alone, in complex with the enzyme product (substrate), such as D-glutamate or
L-glutamate, or in complex with an inhibitor. Crystallization of E. faecalis, E.
faecium, and S. aureus Murl is described in Examples 3-5, and Figures 12-18.
Results show that the asymmetric unit of the crystal consists of a dimer which can
exist in symmetrical (see Figure 1G) or non-symmetrical forms (see Figure 1H),
depending on whether one or both of the substrate binding sites are occupied or
open. The Murl protein is a four-domain structure in terms of overall folding; each
domain has folds of the alpha/beta type. The molecular interface of Murl from Gram
positive bacteria exists between two of the domains and functions as a flexible
element by which a change in conformation opens the substrate binding site, allowing
substrate to bind Murl.

Gram positive bacteria include the bacteria of Bacillus species,
Staphylococcal spp., Streptococcal species, Enterococcal species, Lactobacilli,
Pediococci, and Mycobacterial species. More specifically, Gram positive bacteria
include, for example, B. subtilis, S. aureus, E. faecalis, and E. faecium.

1. E. faecalis

a. Substrate Binding Site

A further embodiment relates to an E. faecalis Murl in which the binding site
comprises two conserved cysteine residues, denoted Cys74 and Cys185 in the
sequence represented as SEQ ID NO: 44. A further embodiment comprises a
binding site of E. faecalis Murl wherein the binding site additionally comprises one
or more of the following amino acid residues: Ser12, Thr76, Thr118, Thr121,
Glu153, Thr186 and His187 as represented by the structural coordinates of Figure
12. In one embodiment, the binding site of the E. faecalis Murl complexes with
D- or L-glutamate and comprises amino acid residues Cys74 and Cys185, as well as
amino acid residues within 5 Å of the Cys74 and Cys185 residues as represented by
the structural coordinates of Figure 13. In one embodiment, the binding site of the
E. faecalis Murl additionally comprises one or more of the following amino acid
residues: Ser12, Thr76, Thr118, Thr121, Glu153, Thr186 and His187. In this
embodiment, the binding site complexes with D- or L-glutamate and comprises the
amino acid residues Cys74 and Cys185 and at least one (one or more) of the
following amino acid residues: Ser12, Thr76, Thr118, Thr121, Glu153, Thr186 and
His187, which can be present in any combination. In a further embodiment, the
binding site includes amino acid residues Ile10, Asp11, Ser12, Gly13, Val14, Gly15,
Gly16, Thr18, Val19, Tyr34, Asp37, Arg40, Cys41, Pro42, Tyr43, Gly44, Pro45,
Arg46, Val51, Glu53, Ile72, Ala73, Cys74, Asn75, Thr76, Ala77, Ser78, Ala79,
Val96, Ile116, Gly117, Thr118, Leu119, Gly120, Thr121, Ile122, Tyr127, Cys145,
Pro146, Val149, Pro150, Leu183, Gly184, Cys185, Thr186, His187, Tyr188 and
Ser208.

In a further embodiment, the binding site of the Murl comprises two
hydrogen bond TRIADs, which occur close to the conserved cysteine residues of the
binding site. Specifically, on one side (Cys185) of the binding site, the TRIAD is
Glu153-Thr186-His187 and on the other side (Cys74), the TRIAD is
Thr76-Thr118-Thr121. Thus, in a specific embodiment of this invention, the binding
site of E. faecalis Murl is complexed with D- or L-glutamate and comprises amino acid
residues Cys74 and Cys185 and additionally comprises at least one (i.e., one or
more) of the following: amino acid residues Ser12, Thr76, Thr118, Thr121, Glu153,
Thr186 and His187.

The threonine residues have features of interest as they relate to the binding
site of E. faecalis Murl. On one side (Cys185) of the binding site, Thr186 is
H-bonded to His187 and His187 is further H-bonded to Glu153. The hydroxyl oxygen
(O) of Thr186 is 2.9 Å away from the amino nitrogen (N) of the substrate (i.e.,
glutamate) and 4.3 Å from the sulfur (S) atom of Cys185, which is 4.1 Å from the
nitrogen (N) of His187.

On the other side of the binding site (Cys74), Thr76 is H-bonded to Thr118
and further H-bonded to Thr121. The hydroxyl O of Thr76 is H-bonded to one of
the carboxylate oxygen atoms of the substrate and is 4.3 Å away from the S atom of
Cys74. Analysis showed that the three hydroxyl oxygens form a triangle, all within
less than 3.4 Å from one another.
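
The triangle criterion described above can be checked numerically once atomic
coordinates are available. The following is a minimal Python sketch, not the
procedure used for the analysis reported here. The function name, the atom labels
(OG1 is the usual coordinate-file name for the threonine hydroxyl oxygen), and
the coordinates are assumptions made for illustration; the coordinates are
placeholders and are not taken from Figure 13.

```python
from itertools import combinations
from math import dist


def forms_tight_triangle(atoms, cutoff=3.4):
    """Return True if every pair of the three atoms is closer than cutoff (Angstroms)."""
    if len(atoms) != 3:
        raise ValueError("exactly three atoms are required")
    return all(dist(a, b) < cutoff for a, b in combinations(atoms.values(), 2))


# Placeholder coordinates in Angstroms, invented for illustration only.
hydroxyl_oxygens = {
    "Thr76:OG1": (0.0, 0.0, 0.0),
    "Thr118:OG1": (2.8, 0.0, 0.0),
    "Thr121:OG1": (1.4, 2.4, 0.0),
}
is_triangle = forms_tight_triangle(hydroxyl_oxygens)
```

With real coordinates substituted for the placeholders, the same check could be
applied with a different cutoff to a threonine TRIAD of another species.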

The two TRIADs may play important roles in altering the pKa of the two 
binding site cysteine residues (in addition to that of the neighboring hydrophobic 
core), facilitating the proton transfer during catalysis or both. 

Another embodiment of the present invention is a crystal of MurI,
comprising a binding site comprising the amino acid residues Cys74, Cys185, Ser12,
Thr76, Thr118, Thr121, Glu153, Thr186 and His187 of the amino acid sequence
represented by SEQ ID NO: 44. This crystal may be further complexed with D- or
L-glutamate. Additionally, the binding site of the crystal complexed with
D-glutamate can comprise the amino acid residues: Cys74, Cys185, and one or more of
the following amino acid residues: Ser12, Thr76, Thr118, Thr121, Glu153, Thr186
and His187. A further embodiment is one in which the binding site additionally
comprises at least one of the following amino acid residues: Asp11, Gly13, Val14,
Gly15, Gly16, Arg40, Cys41, Pro42, Arg46, Ala73, Asn75, Ala77, Val96, Gly117,
Val149, Gly184, and Tyr188. A further embodiment is one in which the binding
site additionally comprises at least one of the following amino acid residues: Ile10,
Thr18, Val19, Tyr34, Asp37, Tyr43, Gly44, Pro45, Val51, Ile72, Ser78, Ala79,
Ile116, Leu119, Gly120, Ile122, Tyr127, Cys145, Pro146, Pro150, Leu183, and
Ser208. The crystals comprising binding sites represented by the above amino acid
residues comprise the amino acid sequence represented by SEQ ID NO: 44, or an
amino acid sequence that is at least 75%, 80%, 85%, 90%, 95%, or 98%
homologous to the amino acid sequence represented by SEQ ID NO: 44.

Location and geometry of a binding site of the enzyme are also defined.
Two conserved cysteine (Cys) residues were identified as the residues responsible
for the (de)protonation of the alpha-carbon of the substrate during catalysis, which is
consistent with the two-base mechanism proposed for function of the enzyme in its
role as a racemase. The two binding site cysteines are Cys74 and Cys185, which are
about 7.0 Angstroms (Å) apart (Cα-Cα distance, i.e., the distance between Cα
atoms). Other amino acid residues identified in the vicinity of the binding site
include Ser12, Thr76, Thr118, Thr121, Glu153, Thr186 and His187. The bound
substrate/product, D- or L-glutamate, is located between the two conserved cysteine
residues. Further detail of the binding site is as follows: looking down the axis
defined by the Glu153 along the γ and α carbon, the two cysteines exhibit a rather
symmetrical environment; each has a hydrophobic core behind, with respect to the
substrate, and a neighboring threonine residue not far from the substrate C-alpha
(3.7 - 3.4 Å when L-Glu is bound; and 3.4 - 4.3 Å when D-Glu is bound),
respectively. Here, product is present in one site and substrate in the other.
Conformational changes bring the cysteines closer in the subunit with the substrate.

Analysis of the crystal structure of the E. faecalis MurI indicates that the
following amino acid residues are within 10 Å of the bound D- or L-glutamate:
10-19, 34-46, 51, 72-78, 96-97, 116-122, 124, 127, 183-189, and 208-209. Analysis of
the crystal structure also shows that the following amino acid residues are within 4
Å of the bound D- or L-glutamate: 11-12, 41-44, 74-76, 118, 121, and 186-187. Of
interest is the fact that only one acidic amino acid residue (Asp11) is present in the
structure surrounding the binding site Cys74, which serves as an anchoring point for
the amino group of the substrate/product with a polar (charged) interaction between
the amino nitrogen atom of D- or L-glutamate and the delta oxygen atom of Asp11.
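
Residue lists written as ranges, as in the paragraph above, are easier to compare
once expanded into explicit residue numbers. The following Python sketch is offered
only as an illustration: the helper name parse_residue_ranges is an assumption,
and the two input strings are copied from the E. faecalis lists above. It expands
both lists and checks that the 4 Å shell is contained in the 10 Å shell.

```python
def parse_residue_ranges(text):
    """Expand a string such as '10-19, 51, and 208-209' into a set of residue numbers."""
    residues = set()
    for token in text.replace("and", "").split(","):
        token = token.strip()
        if not token:
            continue
        if "-" in token:
            start, end = (int(part) for part in token.split("-"))
            residues.update(range(start, end + 1))
        else:
            residues.add(int(token))
    return residues


within_10 = parse_residue_ranges(
    "10-19, 34-46, 51, 72-78, 96-97, 116-122, 124, 127, 183-189, and 208-209"
)
within_4 = parse_residue_ranges("11-12, 41-44, 74-76, 118, 121, and 186-187")

# Every residue in the 4 Angstrom shell should also lie in the 10 Angstrom shell.
inner_is_subset = within_4 <= within_10
```

The same helper could be applied to the corresponding lists given for the other
species.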



b. Intermolecular Dimer Interface 

One embodiment of the present invention is a crystal of E. faecalis MurI
having an intermolecular interface comprising amino acid residues: Gln26, Leu27,
Pro28, Asn29, Glu83, Lys86, Ala87, Ala88, Leu89, Pro90, Ile91, Pro92, Val93,
Val94, Gly95, Val96, Ile97, Leu98, Pro99, Arg102, Ala103, Lys106, Ala130,
Ser133, Lys134, Ala135, Pro136, Ala210, Glu211, Gly214, Glu215, Glu216,
Ser217, Met218, Leu219, Asp221, Tyr222, Phe223, Asp224, Ile225, Ala226,
His227, Thr228, and Pro229 of SEQ ID NO: 44.

c. Intradomain Binding Site

One embodiment of the present invention is crystallized E. faecalis MurI
having an intradomain dimer interface comprising amino acid residues: Asp11,
Ser12, Gly13, Val14, Gly15, Gly16, Leu17, Thr18, Val19, Lys21, Glu22, Lys25,
Ala39, Arg40, Cys41, Pro42, Tyr43, Gly44, Pro45, Arg46, Pro47, Val51, Ala73,
Cys74, Asn75, Thr76, Ala79, Val80, Val96, Ile97, Glu211, Thr212, Gly214,
Glu215, Met218, Leu219, Asp221, and Tyr222 of a first domain, and amino acid
residues Ile97, Leu98, Pro99, Gly100, Ala101, Arg102, Ala103, Ala104, Val105,
Lys106, Val107, Thr118, Leu119, Gly120, Thr121, Lys123, Ser124, Ala125,
Ser126, Tyr127, Ile129, Ala130, Ser133, Lys134, Cys145, Pro146, Lys147, Phe148,
Val149, Pro150, Ile151, Val152, Glu153, Ser154, Asn155, Ile182, Leu183, Gly184,
Cys185, Thr186, His187, Tyr188, Pro189, Leu190, Ile206, Asp207, Ser208,
Gly209, and Ala210 of a second domain, in which the amino acid residues are
represented by SEQ ID NO: 44.

2. E. faecium

a. Substrate Binding Site

A further embodiment relates to an E. faecium MurI in which the substrate
binding site comprises two conserved cysteine residues, denoted Cys77 and Cys188
in the amino acid sequence represented as SEQ ID NO: 48. A further embodiment
comprises a substrate binding site of E. faecium MurI wherein the binding site
additionally comprises one or more of the following amino acid residues: Ser15,
Thr79, Thr121, Thr124, Glu156, Thr189, and His190 as represented by the
structural coordinates of Figure 16. In one embodiment, the substrate binding site of
the E. faecium MurI includes amino acid residues within 5 Å of the Cys77 and
Cys188 residues as represented by the structural coordinates of Figure 16. In one
embodiment, the substrate binding site of the E. faecium MurI additionally
comprises one or more of the following amino acid residues: Ser15, Thr79, Thr121,
Thr124, Glu156, Thr189, and His190. In this embodiment, the substrate binding site
complexes with L-glutamate and comprises amino acid residues Cys77 and Cys188
and at least one (i.e., one or more) of the following amino acid residues: Ser15,
Thr79, Thr121, Thr124, Glu156, Thr189, and His190, which can be present in any
combination. In a further embodiment, the substrate binding site includes amino
acid residues Ile13, Asp14, Ser15, Gly16, Val17, Gly18, Gly19, Thr21, Val22,
Asp40, Arg43, Cys44, Pro45, Tyr46, Gly47, Phe48, Arg49, Val54, Met72, Ala76,
Cys77, Asn78, Thr79, Ala80, Thr81, Ala82, Val99, Ile100, Ile119, Gly120, Thr121,
Ile122, Gly123, Thr124, Val125, Tyr130, Phe149, Val152, Ser153, Glu156, Leu186,
Gly187, Cys188, Thr189, His190, Tyr191 and Ser211 of SEQ ID NO: 48.

In a further embodiment, the substrate binding site of the MurI comprises
two hydrogen bond TRIADs, which occur close to the conserved cysteine residues
of the binding site. Specifically, on one side (Cys188) of the binding site, the
TRIAD is Glu156-Thr189-His190 and on the other side (Cys77), the TRIAD is
Thr79-Thr121-Thr124. Thus, in a specific embodiment of this invention, the
substrate binding site of E. faecium MurI complexes with L-glutamate and
comprises amino acid residues Cys77 and Cys188 and additionally includes at least
one (i.e., one or more) of the following: amino acid residues Ser15, Thr79, Thr121,
Thr124, Glu156, Thr189, and His190.

The threonine residues have features of interest as they relate to the binding
site of the E. faecium MurI. On one side (Cys188) of the substrate binding site,
Thr189 is H-bonded to His190 and His190 is further H-bonded to Glu156. On the
other side of the substrate binding site (Cys77), Thr79 is H-bonded to Thr121 and
further H-bonded to Thr124. The two TRIADs may play important roles in altering
the pKa of the two substrate binding site cysteine residues (in addition to that of the
neighboring hydrophobic core), facilitating the proton transfer during catalysis or
both.



b. Intermolecular Dimer Interface

One embodiment of the present invention is crystallized E. faecium MurI
having an intermolecular interface comprising amino acid residues: Gln29, Leu30,
Pro31, Asn32, Glu86, Lys89, Ala90, Ala91, Leu92, Ser93, Ile94, Pro95, Val96,
Ile97, Gly98, Val99, Ile100, Leu101, Pro102, Arg105, Lys109, Glu136, Lys137,
Val138, Pro139, Glu214, Gly217, Glu218, Ser220, Met221, Leu222, Asp224,
Tyr225, Phe226, Asn230, Ser231, and Pro232 of SEQ ID NO: 48.

c. Intradomain Binding Site

One embodiment of the present invention is crystallized E. faecium MurI
having an intradomain dimer interface comprising amino acid residues: Asp14,
Ser15, Gly16, Val17, Gly18, Gly19, Leu20, Thr21, Val22, Glu25, Lys28, Gln29,
Arg43, Cys44, Pro45, Tyr46, Gly47, Pro48, Arg49, Pro50, Ala51, Val54, Ala76,
Cys77, Asn78, Thr79, Ala82, Val83, Val99, Glu214, Thr215, Val216, Gly217,
Glu218, Met221, Leu222, Leu249, Phe250, Glu252, Ile253, Asp256, and Trp257 of
a first domain, and amino acid residues Ile100, Leu101, Pro102, Gly103, Thr104,
Arg105, Ala106, Ala107, Val108, Arg109, Lys110, Thr121, Ile122, Gly123,
Thr124, Ser127, Gln128, Ala129, Tyr130, Leu132, Ala133, Leu134, Gly136,
Lys137, Pro149, Lys150, Phe151, Val152, Val155, Glu156, Ser157, Asn158,
Ile185, Leu186, Gly187, Cys188, Thr189, His190, Tyr191, Pro192, Leu193, Ile209,
Asp210, Ser211, Gly212, and Ala213 of a second domain, wherein the amino acid
residues are represented by SEQ ID NO: 48.

3. S. aureus

a. Substrate Binding Site

A further embodiment relates to S. aureus MurI in which the binding site
comprises two conserved cysteine residues, denoted Cys72 and Cys184 in the
sequence represented as SEQ ID NO: 46. A further embodiment comprises a
binding site of S. aureus MurI wherein the binding site additionally comprises one
or more of the following amino acid residues: Ser10, Thr74, Thr116, Thr119,
Glu151, Thr185 and His186 as represented by the structural coordinates of Figure
14. In one embodiment, the binding site of S. aureus MurI complexes with
D-glutamate and comprises amino acid residues Cys72 and Cys184 as well as amino
acid residues within 5 Å of the Cys72 and Cys184 residues as represented by the
structural coordinates of Figure 15. In one embodiment, the binding site of the S.
aureus MurI additionally comprises one or more of the following amino acid
residues: Ser10, Thr74, Thr116, Thr119, Glu151, Thr185 and His186. In this
embodiment, the binding site complexes with D-glutamate and comprises amino
acid residues Cys72 and Cys184 and at least one (i.e., one or more) of the following
amino acid residues: Ser10, Thr74, Thr116, Thr119, Glu151, Thr185 and His186,
which can be present in any combination. In a further embodiment, the binding site
includes amino acid residues Ser10, Pro40, Tyr41, Gly42, Cys72, Asn73, Thr74,
Thr116, Thr119, Glu151, Cys184, Thr185 and His186.

In a further embodiment, the binding site of the MurI comprises two
hydrogen bond TRIADs, which occur close to the conserved cysteine residues of the
binding site. Specifically, on one side (Cys184) of the binding site, the TRIAD is
Glu151-Thr185-His186 and on the other side (Cys72), the TRIAD is
Thr74-Thr116-Thr119. Thus, in a specific embodiment of this invention, the binding
site of S. aureus MurI complexes with D-glutamate and comprises amino acid
residues Cys72 and Cys184 and additionally comprises at least one (i.e., one or
more) of the following: amino acid residues Ser10, Thr74, Thr116, Thr119, Glu151,
Thr185 and His186.

The threonine residues have features of interest as they relate to the binding
site of S. aureus MurI. On one side (Cys184) of the binding site, Thr185 is
H-bonded to His186 and His186 is further H-bonded to Glu151. The hydroxyl oxygen
(O) of Thr185 is 2.9 Å away from the amino nitrogen (N) of the substrate (i.e.,
glutamate) and 4.4 Å from the sulfur (S) atom of Cys184, which is 4.1 Å from the
nitrogen of His186.

On the other side of the binding site (Cys72), Thr74 is H-bonded to Thr116
and further H-bonded to Thr119. The hydroxyl O of Thr74 is H-bonded to one of
the carboxylate oxygen atoms of the substrate and is 4.3 Å away from the S atom of
Cys72. Analysis showed that the three hydroxyl oxygens form a triangle, all within
less than 3.4 Å from one another.

The two TRIADs may play important roles in altering the pKa of the two
binding site cysteine residues (in addition to that of the neighboring hydrophobic
core), facilitating the proton transfer during catalysis or both.

Another embodiment of the present invention is crystallized S. aureus MurI,
comprising a binding site comprising the amino acid residues Ser10, Pro40, Tyr41,
Gly42, Cys72, Asn73, Thr74, Thr116, Thr119, Glu151, Cys184, Thr185 and His186
of the amino acid sequence represented by SEQ ID NO: 46. This crystal may be
further complexed with D-glutamate. Additionally, the binding site of the crystal
complexed with D-glutamate can comprise the amino acid residues: Cys72, Cys184,
and one or more of the following amino acid residues: Ser10, Thr74, Thr116,
Thr119, Glu151, Thr185 and His186. A further embodiment is one in which the
binding site additionally comprises at least one of the following amino acid residues:
Asp9, Gly11, Val12, Gly13, Gly14, Arg38, Cys39, Pro40, Tyr41, Gly42, Pro43,
Arg44, Ala71, Asn73, Ala75, Val94, Gly115, Val147, Gly183 and Tyr187. A
further embodiment is one in which the binding site additionally comprises at least
one of the following amino acid residues: Ile8, Thr16, Val17, Tyr32, Asp35, Val49,
Ile70, Ser76, Ala77, Leu114, Gly118, Ile120, Tyr125, Pro144, Pro148, Leu182 and
Ser207. The crystals comprising binding sites represented by the above amino acid
residues comprise the amino acid sequence represented by SEQ ID NO: 46, or an
amino acid sequence that is at least 75%, 80%, 85%, 90%, 95%, 98%, or 99%
homologous to the amino acid sequence represented by SEQ ID NO: 46.

Location and geometry of a binding site of the enzyme are also defined.
Two conserved cysteine (Cys) residues were identified as the residues responsible
for the (de)protonation of the alpha-carbon of the substrate during catalysis, which is
consistent with the two-base mechanism proposed for function of the enzyme in its
role as a racemase. The two binding site cysteines are Cys72 and Cys184, which are
about 7.3 Angstroms (Å) apart (Cα-Cα distance, i.e., the distance between
Cα atoms). Other amino acid residues identified in the vicinity of the binding site
include Ser10, Thr74, Thr116, Thr119, Glu151, Thr185 and His186. The bound
substrate/product, D-glutamate, is located between the two conserved cysteine
residues. Further detail of the binding site is as follows: looking down the axis
defined by the Glu151 along the γ and α carbon, the two cysteines exhibit a rather
symmetrical environment; each has a hydrophobic core with respect to the substrate
and a neighboring threonine residue not far from the substrate C-α (3.4 - 4.3 Å),
respectively.

Analysis of the crystal structure of the S. aureus MurI indicates that the
following amino acid residues are within 10 Å of the bound D-glutamate: 8-17,
32-44, 49, 52, 71-79, 94-95, 114-120, 122, 125, 184-190, and 207-208. Analysis of the
crystal structure also shows that the following amino acid residues are within 4 Å of
the bound D-glutamate: 9-10, 39-42, 72-74, 116, 119, and 184-186. Of interest is
the fact that only one acidic amino acid residue (Asp9) is present in the structure
surrounding the binding site Cys72, which serves as an anchoring point for the
amino group of the substrate/product with a polar (charged) interaction between the
amino nitrogen atom of D-glutamate and the delta oxygen atom of Asp9.

b. Intermolecular Dimer Interface 

One embodiment of the present invention is crystallized S. aureus MurI
having an intermolecular interface comprising amino acid residues: Gln24, Leu25,
Pro26, Asn27, Glu81, Glu84, Ser90, Val91, Ile92, Glu96, Pro97, Arg100, Thr101,
Met104, Arg131, Ile132, Asn133, Pro134, Arg213, Glu214, Ser216, Ala217,
Leu218, Thr220, Phe221, Ala226, Ser227, and Tyr228 of SEQ ID NO: 46.

c. Intradomain Binding Site

One embodiment of the present invention is crystallized S. aureus MurI
having an intradomain dimer interface comprising amino acid residues: Asp9, Ser10,
Gly11, Val12, Gly13, Gly14, Leu15, Thr16, Val17, Glu20, Cys39, Pro40, Tyr41,
Gly42, Pro43, Arg44, Pro45, Gly46, Val49, Ala71, Cys72, Asn73, Thr74, Ala77,
Val78, Val94, Ile95, Glu210, Thr211, Ala212, Arg213, Glu214, Ala217, Leu218,
His244, Asn247, Ile248, Glu251, and Trp252 of a first domain and amino acid
residues Ile95, Glu96, Pro97, Gly98, Ala99, Arg100, Thr101, Ala102, Ile103,
Met104, Thr105, Thr116, Glu117, Gly118, Thr119, Ser122, Glu123, Ala124,
Tyr125, His128, Arg131, Ile132, Pro144, Gly145, Phe146, Val147, Val150, Glu151,
Gln152, Met153, Ile181, Leu182, Gly183, Cys184, Thr185, His186, Tyr187,
Pro188, Leu189, Ile205, Ser206, Ser207, Gly208, and Leu209 of a second domain,
wherein the amino acid residues are represented by SEQ ID NO: 46.



C. Atypical Gram negative - H. pylori 

As described in the examples that follow, MurI of H. pylori has been
crystallized, and crystal structures (three-dimensional structure) determined. The
structures determined represent that of MurI alone, and MurI in complex with an
inhibitor or in complex with the enzyme substrate, glutamate, in which glutamate
can be in the L- or D-form. NMR data suggest that MurI has stable structural
elements in the absence of substrate. Crystallization of H. pylori MurI is described
in Example 1 and Figures 4-7.

Experimental results show that the unit cell of crystals of H. pylori MurI can
be a monomer, a symmetrical or nonsymmetrical dimer, or a multimer, depending
upon the crystallization conditions and that the active form of MurI from an atypical
bacterium is a dimer. Each monomer consists of two distinct and very similar
alpha/beta type domains that interact at a molecular interface.

The H. pylori MurI dimer is held together by a stable and conserved
hydrophobic core created by a four helix bundle (e.g., two helices from each
monomer, residues 143 to 169) that links the two domains and thereby the two
monomers rather rigidly together. This arrangement leaves the other two domains
of each monomer free to move in order to give access to the two active sites which
occur at the dimer interface. NMR data show that the protein is also a dimer in the
absence of a ligand, and is more flexible in this state than when ligand is bound.
Thus, the molecular interface of MurI from an atypical bacterium exists at the
junctions of two domains and functions as a flexible element by which a change in
conformation opens the substrate binding site, allowing substrate to bind MurI. The
molecular interface of H. pylori MurI is less flexible than that of typical Gram
negative bacteria due to the strict requirement of an activator for substrate binding
and enzyme activity in typical Gram negative bacteria.

The amino acid residues which comprise the binding site of MurI are
conserved in seventeen strains of H. pylori that have been sequenced.

One embodiment of the present invention relates to a binding site of an H.
pylori MurI. As used herein, the term "binding site" refers to a specific region (or
atom) of MurI that interacts with another molecular entity. In certain embodiments,
a binding site may comprise, or be defined by, the three dimensional arrangement of
one or more amino acid residues within a folded polypeptide.

In the present invention, a substrate can be a compound such as L-glutamate
or D-glutamate which MurI reversibly converts from the R- to S-enantiomer. Thus,
a substrate can also act as a product. The substrate can be a naturally-occurring or
artificial compound.

In the present invention, an inhibitor can be a compound which also may
undergo a catalytic reaction, bind to the substrate binding site, or another site on
MurI and which competes with substrate turnover of glutamate. Inhibitors of the
present invention can be a compound such as L-serine-O-sulfate, D-serine-O-sulfate,
D-aspartate, L-aspartate, tartrate, citrate, phosphate, sulfate, aziridino-glutamate,
N-hydroxyglutamate, or 3-chloroglutamate. The inhibitor can be a naturally-occurring
or artificial compound.

One embodiment of the present invention relates to a binding site of H.
pylori MurI, in which the binding site comprises two conserved cysteine residues,
denoted Cys70 and Cys181 as represented by the amino acid sequence of SEQ ID
NO: 2. A further embodiment comprises a binding site of H. pylori MurI wherein
the binding site additionally comprises one or more of the following amino acid
residues: Ser8, Thr72, Thr116, Thr119, Thr182, His183 and Glu150 as represented
by the structural coordinates of Figure 4. In one embodiment, the binding site of the
H. pylori MurI is complexed with D-glutamate and comprises amino acid residues
Cys70 and Cys181 as well as amino acid residues within 5 Å of the Cys70 and
Cys181 residues as represented by the structural coordinates of Figure 5. In one
embodiment, the binding site of H. pylori MurI additionally comprises one or more
of the following amino acid residues: Ser8, Thr72, Thr116, Thr119, Thr182, His183
and Glu150. In this embodiment, the binding site is complexed with D-glutamate
and comprises amino acid residues Cys70 and Cys181 and at least one (i.e., one or
more) of the following amino acid residues: Ser8, Thr72, Thr116, Thr119, Thr182,
His183 and Glu150, which can be present in any combination. In a further
embodiment, the binding site includes amino acid residues Asp7, Ser8, Gly9, Val10,
Gly11, Gly12, Val37, Pro38, Tyr39, Gly40, Thr41, Ala69, Cys70, Asn71, Thr72,
Ala73, Gly115, Thr116, Lys117, Ala118, Thr119, Val146, Ile149, Glu150, Gly180,
Cys181, Thr182, and His183.

In a further embodiment, the binding site of the MurI comprises two
hydrogen bond TRIADs, which occur close to the conserved cysteine residues of the
binding site. Specifically, on one side (Cys181) of the binding site, the TRIAD
consists essentially of Thr182-His183-Glu150 and on the other side (Cys70), the
TRIAD consists essentially of Thr72-Thr116-Thr119. Thus, in a specific
embodiment of this invention, the binding site of H. pylori MurI is complexed with
D-glutamate and comprises amino acid residues Cys70 and Cys181 and additionally
comprises at least one (i.e., one or more) of the following: amino acid residues Ser8,
Thr72, Thr116, Thr119, Thr182, His183, and Glu150.

In addition, both TRIADs involve conserved residues throughout all
available H. pylori MurI sequences. The threonine residues have features of interest
as they relate to the binding site of the H. pylori MurI. On one side (Cys181) of the
binding site, Thr182 is H-bonded to His183 and His183 is further H-bonded to
Glu150. All three residues are conserved for all MurI proteins in H. pylori strains
tested. The hydroxyl oxygen (O) of Thr182 is 3 Å away from the amino nitrogen (N)
of the substrate (i.e., glutamate) and 4.4 Å from the sulfur (S) atom of Cys181.

On the other side of the binding site (Cys70), Thr72 is H-bonded to Thr116
and further H-bonded to Thr119. The hydroxyl oxygen (O) of Thr72 is 3.4 Å away
from one of the carboxylate oxygen (O) atoms of the substrate and is 4.5 Å away
from the sulfur (S) atom of Cys70. Analysis showed that the three hydroxyl
oxygens of the threonines form a triangle, all within less than 3.2 Å from one
another.

The two TRIADs may play important roles in altering the pKa of the two
binding site cysteine residues (in addition to that of the neighboring hydrophobic
core), facilitating the proton transfer during catalysis or both.

Another embodiment of the present invention is crystallized MurI,
comprising a binding site comprising the amino acid residues Asp7, Ser8, Gly9,
Val10, Gly11, Gly12, Val37, Pro38, Tyr39, Gly40, Thr41, Ala69, Cys70, Asn71,
Thr72, Ala73, Gly115, Thr116, Lys117, Ala118, Thr119, Val146, Ile149, Glu150,
Gly180, Cys181, Thr182, and His183 of the amino acid sequence represented by
SEQ ID NO: 2. This crystal may be further complexed with D-glutamate.
Additionally, the binding site of the crystal complexed with D-glutamate can
comprise the amino acid residues: Cys70, Cys181, and one or more of the following
amino acid residues: Ser8, Thr72, Thr116, Thr119, Thr182, His183 and Glu150. A
further embodiment is one in which the binding site additionally comprises at least
one of the following amino acid residues: Asp7, Gly9, Val10, Gly11, Gly12, Arg36,
Val37, Pro38, Tyr39, Gly40, Thr41, Lys42, Ala69, Asn71, Ala73, Val92, Gly115,
Val146, Gly180, and Phe184. A further embodiment is one in which the binding site
additionally comprises at least one of the following amino acid residues: Phe6,
Asp7, Gly9, Val10, Gly11, Gly12, Ser14, Val15, Tyr30, Asp33, Arg36, Val37,
Pro38, Tyr39, Gly40, Thr41, Lys42, Ile47, Val68, Ala69, Asn71, Ala73, Ser74,
Ala75, Val92, Leu114, Gly115, Lys117, Ala118, Ile120, Tyr125, Ser143, Val146,
Pro147, Leu179, Gly180, Phe184, and Ser210.

The crystals with binding sites comprising the above amino acid residues
comprise the amino acid sequence that is represented by SEQ ID NO: 2; or an amino
acid sequence that is at least 75%, 80%, 85%, 90%, 95%, or 98% homologous to the
amino acid sequence that is represented by SEQ ID NO: 2.
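
The percentage thresholds above can be illustrated with a short calculation. The
following Python sketch rests on assumptions that the text does not state: it
treats the percentage as percent identity over an already aligned pair of
sequences, ignoring gap columns, which is only one of several possible measures of
homology. The function names and the two toy fragments are invented for
illustration and are not taken from SEQ ID NO: 2.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned (non-gap) columns at which the two sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Thresholds recited in the paragraph above.
THRESHOLDS = (75, 80, 85, 90, 95, 98)


def thresholds_met(identity):
    """Return the recited thresholds that the given percentage satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Toy aligned fragments, invented for illustration (not SEQ ID NO: 2).
reference = "MKIGVFDSGV"
candidate = "MKIGVFDSGI"
identity = percent_identity(reference, candidate)
met = thresholds_met(identity)
```

In practice the alignment itself would come from a dedicated alignment tool; this
sketch covers only the final percentage step.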

The crystals of the present invention diffract to about 1.86 Å to about 3.0 Å,
and could be refined to about 1.5 Å.

Location and geometry of a binding site of the enzyme are also defined.
Two conserved cysteine (Cys) residues are identified as the residues responsible for
the (de)protonation of the alpha-carbon of the substrate (D-glutamate) during
catalysis, which is consistent with the two-base mechanism proposed for how the
enzyme functions in its role as a racemase. The two binding site cysteines are
Cys70 and Cys181, which are about 7.5 Angstroms (Å) apart (Cα-Cα distance, i.e.,
the distance between Cα atoms). Other amino acid residues identified in the vicinity
of the binding site include Ser8, Thr72, Thr182, His183 and Glu150. The bound
substrate/product, D-glutamate, is located between the two conserved cysteine
residues. Further detail of the binding site is as follows: looking down the axis
defined by the Glu150 along the γ and α carbon, the two cysteines exhibit a rather
symmetrical environment; each has a hydrophobic core behind, with respect to the
substrate, and a neighboring threonine residue not far from the substrate C-alpha
(3.7 - 4 Å), respectively.

Analysis of the crystal structure of the H. pylori MurI indicates that the
following amino acid residues are within 10 Å of the bound D-glutamate: 6-15,
30-42, 47, 50, 68-76, 210-211, 179-185, 114-120, 122, 92-93, 125, 35, 37, 38, and 42.
Note that residues 35, 37, 38 and 42 are from the other molecule of the dimer.
Analysis of the crystal structure also shows that the following amino acid residues
are within 4 Å of the bound D-glutamate: 7-8, 37-40, 70-72, 116, 119, and 181-183.
Of interest is the fact that only one acidic amino acid residue (Asp7) is present in the
structure surrounding the binding site Cys70, which serves as an anchoring point for
the amino group of the substrate/product with a polar (charged) interaction between
the amino nitrogen atom of D-glutamate and the delta oxygen atom of Asp7.

Location and geometry of the active site of the enzyme were also defined.
Two key residues were identified as the residues responsible for creating the pocket
for the inhibitor, interacting with the core ring system in its new position, and fixing
the position of the central core. When inhibitor is bound, Cγ of Leu186 and Cβ of
Trp252 are 8.9 Angstroms (Å) apart.

1. Substrate Binding Site 

One embodiment of the present invention is crystallized H. pylori MurI
having a substrate binding site comprising the amino acid residues Cys70, Cys181,
Ser8, Thr72, Thr116, Thr119, Thr182, His183 and Glu150 of the amino acid
sequence represented by SEQ ID NO: 2. Additionally, the binding site of the crystal
complexed with D-glutamate can comprise amino acid residues: Cys70, Cys181, and
one or more of the following amino acid residues: Ser8, Thr72, Thr116, Thr119,
Thr182, His183 and Glu150. A further embodiment is one in which the binding site
additionally comprises at least one of the following amino acid residues: Asp7,
Gly9, Val10, Gly11, Gly12, Arg36, Val37, Pro38, Tyr39, Gly40, Thr41, Lys42,
Ala69, Asn71, Ala73, Val92, Gly115, Val146, Gly180, and Phe184. A further
embodiment is one in which the binding site additionally comprises at least one of
the following amino acid residues: Phe6, Asp7, Gly9, Val10, Gly11, Gly12, Ser14,
Val15, Tyr30, Asp33, Arg36, Val37, Pro38, Tyr39, Gly40, Thr41, Lys42, Ile47,
Val68, Ala69, Asn71, Ala73, Ser74, Ala75, Val92, Leu114, Gly115, Lys117,
Ala118, Ile120, Tyr125, Ser143, Val146, Pro147, Leu179, Gly180, Phe184, and
Ser210.

2. Intermolecular Dimer Interface 

One embodiment of the present invention is crystallized H. pylori MurI
having an intermolecular dimer interface comprising the amino acid residues Ser34,
Ala35, Arg36, Val37, Pro38, Tyr39, Gly40, Thr41, Lys42, Asp43, Pro44, Thr46,
Phe50, Lys117, Asn121, Ser143, Leu144, Pro147, Leu148, Glu150, Glu151,
Ser152, Ile153, Gly157, Leu158, Thyr161, Cys162, Tyr165, Tyr166, Ser239,
Gly240, Asp241, and Trp244 of SEQ ID NO: 2.

3. Intradomain Binding Site 

One embodiment of the present invention is crystallized H. pylori MurI
having an intradomain interface comprising the amino acid residues Asp7, Ser8,
Gly9, Val10, Gly11, Gly12, Phe13, Ser14, Val15, Ser18, Lys21, Ala22, Val37,
Pro38, Tyr39, Gly40, Thr41, Lys42, Asp43, Pro44, Ile47, Ala69, Cys70, Asn71,
Thr72, Ser74, Ala75, Leu76, Gly91, Val92, Gly211, Asp212, Ala213, Ile214,
Val215, Glu216, Tyr217, Leu218, Gln219, Gln220, Lys221, Glu251, Trp252,
Leu253, Lys254, and Leu255 of a first domain, and amino acid residues Ile93,
Glu94, Pro95, Ser96, Ile97, Leu98, Ala99, Ile100, Arg102, Gln103, Thr116,
Lys117, Ala118, Thr119, Ser122, Asn123, Ala124, Tyr125, Ala128, Gln131,
Gln132, Ser143, Val146, Pro147, Ile149, Glu150, Glu151, Ser152, Ile178, Leu179,
Gly180, Cys181, Thr182, His183, Phe184, Pro185, Leu186, Ile208, His209, Ser210,
Gly211, and Asp212 of a second domain of SEQ ID NO: 2.

4. Inhibitor Binding Site

In the present invention, the amino acid residues to which the inhibitor binds
constitute an inhibitor binding site.

One embodiment of the present invention is an inhibitor binding site of H.
pylori Murl, in which the inhibitor binding site comprises two key residues, denoted
as Leu186 and Trp252 in the amino acid sequence represented as SEQ ID NO: 2. In
a further embodiment, the binding site of the H. pylori Murl is complexed with an
inhibitor and comprises amino acid residues Leu186 and Trp252 as well as amino
acid residues within 5 Å, 7 Å, 8 Å, 10 Å, 15 Å, or 20 Å of the Leu186 and Trp252
residues of SEQ ID NO: 2.
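
The distance-based definition of the binding site can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `residues_within`, the one-atom-per-residue toy coordinates, and the choice of "any atom within the cutoff" as the selection rule are all hypothetical and are not taken from the coordinates of the figures.

```python
import math

def residues_within(atoms, key_residues, cutoff):
    """Return the set of residue labels that have at least one atom within
    `cutoff` angstroms of any atom belonging to a key residue.

    atoms: list of (residue_label, (x, y, z)) tuples.
    key_residues: set of residue labels, e.g. {"Leu186", "Trp252"}.
    """
    key_atoms = [xyz for res, xyz in atoms if res in key_residues]
    selected = set()
    for res, xyz in atoms:
        if any(math.dist(xyz, k) <= cutoff for k in key_atoms):
            selected.add(res)
    return selected

# Toy coordinates (hypothetical, one atom per residue for brevity).
toy_atoms = [
    ("Leu186", (0.0, 0.0, 0.0)),
    ("Trp252", (6.0, 0.0, 0.0)),
    ("Trp244", (3.0, 4.0, 0.0)),   # 5 angstroms from both key atoms
    ("Ser152", (0.0, 7.5, 0.0)),   # 7.5 angstroms from Leu186
    ("Lys17", (30.0, 0.0, 0.0)),   # far from both key atoms
]

site_5 = residues_within(toy_atoms, {"Leu186", "Trp252"}, 5.0)
site_8 = residues_within(toy_atoms, {"Leu186", "Trp252"}, 8.0)
```

Widening the cutoff from 5 Å to 8 Å enlarges the selected set, which mirrors the nested series of cutoffs recited above.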

In one embodiment of the present invention, the inhibitor binding site of H.
pylori Murl is complexed with an inhibitor, and the inhibitor binding site comprises
amino acid residues Leu186 and Trp252, and additionally comprises at least one
(i.e., one or more) of the following amino acid residues: Val10, Gly11, Phe13,
Ile149, Glu151, Ser152, Trp244, and Gln248 of SEQ ID NO: 2. In a further
embodiment, the inhibitor binding site additionally comprises at least one (i.e.,
one or more) of the following amino acid residues: Gly12, Ser14, Lys17, Glu150,
Leu154, Thr182, and His183. In a further embodiment, the inhibitor binding site
additionally comprises at least one (i.e., one or more) of the following amino acid
residues: Phe13, Trp244, and Leu253.

Alternatively, the present invention is a crystal of Murl, comprising an inhibitor
binding site comprising the amino acid residues Val10, Gly11, Gly12, Phe13, Ser14,
Lys17, Ile149, Glu150, Glu151, Ser152, Thr182, His183, Leu186, Trp244, Gln248,
Trp252 and Leu253.




VI. Inhibitors

Inhibitors (such as small molecules, proteins, polypeptides, and peptides)
can be utilized to bind a binding domain of Murl or the immediate surrounding
area and prevent the movement of Murl such that substrates cannot bind the
substrate binding site (SBS). In doing so, blockage of Murl would prevent building
of the bacterial cell wall in a manner similar to antibiotics. However, since Murl is
ubiquitous in bacteria, it is expected that an inhibitor that blocks the molecular
interface will be a broad spectrum inhibitor that can act against entire genera of
bacteria. Inhibitors of the present invention partially, or fully, block the molecular
interface of Murl.

The terms "small molecule" and "small compound", as used herein, are
meant to refer to compositions that have a molecular weight of less than about 5 kD
and most preferably less than about 2.5 kD. Small molecules and small compounds
are used interchangeably herein. Small molecules can be nucleic acids, peptides,
polypeptides, peptidomimetics, carbohydrates, lipids or other organic (carbon
containing) or inorganic molecules. Many pharmaceutical companies have
extensive libraries of chemical and/or biological mixtures comprising arrays of small
molecules, often fungal, bacterial, or algal extracts, which can be screened to
identify an inhibitor of Murl.

An example of the present invention is crystallized Murl in complex with an
inhibitor (e.g., an antibacterial binding agent, drug, etc.). A specific example is
crystallized H. pylori Murl in complex with an inhibitor. In a further embodiment of
the present invention, the crystallized complex is characterized by the structural
coordinates depicted in Figure 7, in which the determined structures presented
represent that of Murl in complex with an inhibitor. In a further embodiment, the
inhibitor is a pyrimidinedione, such as an imidazolyl pyrimidinedione, a thiophenyl
pyrimidinedione, a furanyl pyrimidinedione, a pyrazolo pyrimidinedione, or a
pyrrolyl pyrimidinedione.

In further embodiments, the pyrimidinedione is compound A, compound B,
compound C, compound D, compound E, compound F, compound G, compound H,
compound I, compound J, compound K, compound L, compound M, compound N,
compound O, compound P, compound Q, compound R, compound S, compound T,
compound U, compound V, compound W, compound X, compound Y, compound
Z, compound AA, compound AB, compound AC, compound AD, compound AE,
compound AF, compound AG, compound AH, compound AI, compound AJ, or
compound AK. The crystalline complexes are characterized by the space groups
and cell dimensions depicted in Table 5.



Table 5.

(Chemical structure drawings are not reproduced; [illegible] marks entries that
could not be read.)

Compound     Space Group   a (Å)     b (Å)     c (Å)     alpha (°)  beta (°)  gamma (°)
A            P2₁2₁2₁       61.41     76.31     108.92    90         90        90
B            P2₁           57.27     76.59     60.00     90         98.74     90
C            P2₁2₁2₁       61.54     75.80     108.13    90         90        90
D            P2₁2₁2₁       61.67     75.63     108.18    90         90        90
E            P2₁           56.06     62.10     75.86     90         93.46     90
F            P2₁2₁2        59.92     78.21     56.86     90         90        90
G            P2₁2₁2₁       61.9      76.0      108.9     90         90        90
H            P2₁2₁2        61.0      78.7      57.0      90         90        90
I            P2₁           57.47     77.03     60.57     90         98.96     90
J            P2₁2₁2₁       61.21     75.19     107.77    90         90        90
K            P2₁           56.61     76.45     60.24     90         99.05     90
L            P2₁2₁2₁       61.78     75.59     108.14    90         90        90
M            P2₁           56.04     76.19     60.16     90         98.56     90
N            P2₁2₁2₁       61.72     75.47     108.23    90         90        90
O            [illegible]   [illegible]                   90         97.91     90
P            P2₁2₁2        60.419    78.322    56.680    90         90.0      90
Q            P2₁           57.557    77.694    60.192    90         98.999    90
R            P2₁           57.140    62.179    75.679    90         94.045    90
S            P2₁           55.969    62.338    75.846    90         93.431    90
T            P2₁           57.50     77.08     60.62     90         98.93     90
U or V       P2₁           57.22     62.07     75.66     90         93.99     90
V or W       P2₁2₁2        60.67     77.47     56.57     90         90        90
X            P2₁2₁2₁       61.8      75.9      108.3     90         90        90
[illegible]  [illegible]   60.675    77.47     56.57     90         90        90
[illegible]  P2₁2₁2₁       61.72     75.48     108.23    90         90        90
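
As a worked illustration of how the cell dimensions in Table 5 can be used, the following Python sketch computes a unit cell volume from the three edge lengths and three angles. It uses the general (triclinic) volume formula, which reduces to a·b·c for the orthorhombic entries and to a·b·c·sin(beta) for the monoclinic entries. The function name `unit_cell_volume` is hypothetical, and the assumption that the edge lengths are in Å and the angles in degrees is the usual crystallographic convention rather than something stated in the table.

```python
import math

def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume (cubic angstroms) of a unit cell with edges a, b, c (angstroms)
    and angles alpha, beta, gamma (degrees), via the general triclinic formula."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

# First (orthorhombic) row of Table 5.
vol_a = unit_cell_volume(61.41, 76.31, 108.92)

# Row for compound B (monoclinic, beta = 98.74 degrees).
vol_b = unit_cell_volume(57.27, 76.59, 60.00, beta=98.74)
```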



An example of the present invention is a crystal of H. pylori Murl
complexed with an antibacterial binding agent and the product (substrate)
D-glutamate, wherein the Murl is at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or
99% homologous to the amino acid sequence represented by any one of SEQ ID
NOS: 2-34, 40, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, and 74, or a portion
thereof. In the present invention, the crystals diffract from about 0.8 Å to about
3.5 Å.
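
One common way to put a number on thresholds such as "at least 70%" is percent identity over an existing pairwise alignment. The sketch below is offered only on that assumption: the text above says "homologous" and does not specify how the percentage is computed. The function name `percent_identity` and the two short sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.
    Columns containing a gap in either sequence are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned (non-gap) columns to compare")
    return 100.0 * matches / compared

# Made-up ten-residue fragments differing at one position.
identity = percent_identity("MKIGVFDSGV", "MKIGIFDSGV")
meets_70 = identity >= 70.0
```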

VII. Computer-Assisted Methods of Identifying Murl Inhibitors

Also the subject of this invention is a computer-assisted method for
identifying a potential modifier, particularly a potential inhibitor, of Murl activity.
The method comprises providing a computer modeling application with a set of
relative structural coordinates of Murl, or a binding site thereof, wherein the set of
relative structural coordinates is selected from a set of relative structural coordinates
of Murl; supplying the computer modeling application with a set of structural
coordinates of a candidate inhibitor of Murl; comparing the two sets of coordinates;
and determining whether the candidate inhibitor is expected to bind, or interfere
with, the Murl, or a binding site thereof. Binding to, or interference with, Murl, or a
binding site thereof, is indicative of an inhibitor of Murl activity and, thus,
indicative of an antibacterial agent. In most instances, determining whether the
candidate inhibitor is expected to bind, or interfere with, the Murl, or a binding site
thereof, includes performing a fitting operation or comparison between the candidate
inhibitor and the Murl, or binding site thereof, followed by computational analysis
of the outcome of the comparison in order to determine the association between, or
the interference with, the candidate inhibitor and Murl or binding site. A candidate
inhibitor identified by such methods is a candidate antibacterial drug. Optionally, a
candidate drug can be synthesized or otherwise obtained and further assessed (e.g.,
in vitro, in cells or in an appropriate animal model) for its ability to inhibit Murl.

In a specific embodiment, the computer-assisted method of identifying an
agent that is a binding agent of Murl comprises the steps of (1) supplying the
computer modeling application the coordinates of a known agent that binds a
molecular interface of Murl and the coordinates of Murl; (2) quantifying the fit of an
agent that binds the molecular interface of Murl to Murl; (3) supplying the computer
modeling application with a set of structural coordinates of an agent to be assessed
to determine if it binds a molecular interface of Murl; (4) quantifying the fit of the
test agent in the molecular interface using a fit function; (5) comparing the fit
calculation for the known agent with that of the test agent; and (6) selecting a test
agent that has a fit that is better than, or approximates the fit of the known agent.
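
Steps (2) and (4) through (6) above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the fit function of any modeling package: the scoring rule (one point per ligand atom in contact with the site, minus one point per clashing atom), the 4 Å and 2 Å cutoffs, the `tolerance` parameter, the names `fit_score` and `select_candidates`, and all coordinates are hypothetical.

```python
import math

def fit_score(ligand, site, contact=4.0, clash=2.0):
    """Toy fit function: +1 for each ligand atom whose nearest site atom lies
    between `clash` and `contact` angstroms away, -1 for each ligand atom
    closer than `clash`. Higher scores mean a better fit."""
    score = 0
    for atom in ligand:
        nearest = min(math.dist(atom, s) for s in site)
        if nearest < clash:
            score -= 1
        elif nearest <= contact:
            score += 1
    return score

def select_candidates(known, candidates, site, tolerance=0):
    """Return names of candidates whose score is at least the known agent's
    score minus `tolerance` (i.e. better than, or approximating, the known fit)."""
    reference = fit_score(known, site)
    return [name for name, ligand in candidates.items()
            if fit_score(ligand, site) >= reference - tolerance]

# Hypothetical site and agents, each given as a list of (x, y, z) atom positions.
site = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
known_agent = [(0.0, 3.0, 0.0), (4.0, 3.0, 0.0)]
test_agents = {
    "tight": [(0.0, 3.0, 0.0), (4.0, 3.0, 0.0), (2.0, 3.0, 0.0)],
    "clashing": [(0.0, 1.0, 0.0), (4.0, 3.0, 0.0)],
    "distant": [(0.0, 10.0, 0.0)],
}

known_score = fit_score(known_agent, site)
selected = select_candidates(known_agent, test_agents, site)
```

A nonzero `tolerance` would admit test agents whose fit only approximates that of the known agent.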

In a specific embodiment, the computer-assisted method of identifying an
agent that is a binding agent of Murl comprises the steps of (1) supplying the
computer modeling application the coordinates of an activator and/or a substrate,
and a Murl, (2) quantifying the fit of a known binding agent of Murl to Murl, (3)
supplying the computer modeling application with a set of structural coordinates of
an agent to be assessed to determine if it binds a binding domain of Murl, (4)
quantifying the fit of the test agent in the binding site using a fit function, (5)
comparing the fit calculation for the known agent with that of the test agent, and (6)
selecting a test agent that has a fit that is better than, or approximates the fit of the
known agent.

One embodiment of the present invention relates to a process which may be
used to identify Murl inhibitors having the steps of:

(1) providing one or more molecular structures individually or as members
of any suitable commercial or proprietary structure-searchable database of chemical
compounds;

(2) selecting the molecular structures from step (1) to be converted into
three-dimensional structures and placed into a binding/accessory site of Murl such
that the moiety is constrained to make the appropriate hydrogen bond interactions;

(3) analyzing the remainder of the constrained molecular structure to
determine if it contains any other suitably placed moiety or moieties which allows
one to confirm whether the group(s) fit appropriately into the binding or accessory
site of Murl;

(4) analyzing the molecular structures selected from step (3) to determine
whether the distances and the polar/non-polar surface areas are within the
specified ranges.

It would be apparent to one skilled in the art that the above steps do not need
to be performed in the above order. Alternatively, one skilled in the art would
recognize that fragments of the moiety or moieties described above could be
sufficient to inhibit Murl activity if they substantially fill the binding domain (site)
or accessory site, and such fragments could be tested in vitro for inhibitory activity.
These fragments may be obtained by reference to the generic and specific examples
provided in this application or by searching other structures that have the required
inter-group distances and polar and non-polar surface areas.

Bacterial Murl inhibitors may also be obtained by modifying compound 
structures to include the pharmacophore features described above. 


VIII. Computer-Assisted Methods of Screening Murl Inhibitors 

One skilled in the art may use one of several methods to screen chemical entities
or fragments for their ability to associate with Murl and more particularly with the
individual binding domains of Murl. This process may begin, for example, by
visual inspection of the substrate binding site on the computer screen based on Murl
coordinates. Selected fragments or chemical entities may then be positioned relative
to the substrate binding site of Murl. Docking may be accomplished using software
such as Quanta and Sybyl, followed by energy minimization and molecular
dynamics with standard molecular mechanics forcefields, such as CHARMM and
AMBER.

Specialized computer programs may also assist in the process of selecting
fragments or chemical entities. These include:

• GRID [P. J. Goodford, "A Computational Procedure for Determining
Energetically Favorable Binding Sites on Biologically Important
Macromolecules", J. Med. Chem., 28:849-857 (1985)]. GRID is available
from Oxford University, Oxford, UK.

• MCSS [A. Miranker and M. Karplus, "Functionality Maps of Binding Sites:
A Multiple Copy Simultaneous Search Method", Proteins: Structure,
Function and Genetics, 11:29-34 (1991)]. MCSS is available from
Molecular Simulations, Burlington, Mass.

• AUTODOCK [D. S. Goodsell and A. J. Olson, "Automated Docking of
Substrates to Proteins by Simulated Annealing", Proteins: Structure,
Function and Genetics, 8:195-202 (1990)]. AUTODOCK is available from
Scripps Research Institute, La Jolla, Calif.

• DOCK [I. D. Kuntz et al., "A Geometric Approach to Macromolecule-Ligand
Interactions", J. Mol. Biol., 161:269-288 (1982)]. DOCK is available from
University of California, San Francisco, Calif.

Additional commercially available computer databases for small molecular
compounds include the Cambridge Structural Database and the Fine Chemical
Database; for a review see Rusinko, A. (Chem. Des. Auto. News 8, 44-47 (1993)).

Once suitable chemical entities or fragments have been selected, they can be
assembled into a single compound or inhibitor. Assembly may be preceded by
visual inspection of the relationship of the fragments to each other on the
three-dimensional image displayed on a computer screen in relation to the structure
coordinates of Murl. This would be followed by manual model building using
software such as Quanta or Sybyl.

Useful programs to aid one of skill in the art in connecting the individual
chemical entities or fragments include:

• CAVEAT [P.A. Bartlett et al., "CAVEAT: A Program to Facilitate the
Structure-Derived Design of Biologically Active Molecules", in Molecular
Recognition in Chemical and Biological Problems, Special Pub., Royal
Chem. Soc. 78, pp. 182-196 (1989)]. CAVEAT is available from the
University of California, Berkeley, Calif.

• 3D Database systems such as MACCS-3D (MDL Information Systems, San
Leandro, Calif.). This area is reviewed in Y.C. Martin, "3D Database
Searching in Drug Design", J. Med. Chem., 35:2145-2154 (1992).

• HOOK (available from Molecular Simulations, Burlington, Mass.). 

Instead of proceeding to build a Murl inhibitor in a step-wise fashion one
fragment or chemical entity at a time as described above, inhibitory or other type of
binding compounds may be designed as a whole or "de novo" using either an empty
active site or optionally including some portion(s) of a known inhibitor(s). These
methods include:

• LUDI [H.-J. Bohm, "The Computer Program LUDI: A New Method for the
De Novo Design of Enzyme Inhibitors", J. Comp. Aid. Molec. Design 6:61-78
(1992)]. LUDI is available from Biosym Technologies, San Diego, Calif.

• LEGEND [Y. Nishibata and A. Itai, Tetrahedron, 47:8985 (1991)].
LEGEND is available from Molecular Simulations, Burlington, Mass.

• LeapFrog (available from Tripos Associates, St. Louis, Mo.).

Other molecular modeling techniques may also be employed to screen for
inhibitors of Murl. See, e.g., N.C. Cohen et al., "Molecular Modeling Software and
Methods for Medicinal Chemistry", J. Med. Chem., 33:883-894 (1990). See also,
M. A. Navia and M. A. Murcko, "The Use of Structural Information in Drug
Design", Current Opinions in Structural Biology, 2:202-210 (1992). For example,
where the structures of test compounds are known, a model of the test compound
may be superimposed over the model of the structure of the invention. Numerous
methods and techniques are known in the art for performing this step. Any of these
may be used. See, e.g., P.S. Farmer, Drug Design, Ariens, E. J., ed., Vol. 10, pp
119-143 (Academic Press, New York, 1980); U.S. Pat. No. 5,331,573; U.S. Pat. No.
5,500,807; C. Verlinde, Structure, 2:577-587 (1994); and I. D. Kuntz, Science,
257:1078-1082 (1992). The model building techniques and computer evaluation
systems described herein are not a limitation on the present invention.

A variety of conventional techniques may be used to carry out each of the
above evaluations as well as the evaluations necessary in screening a candidate
compound for ability to inhibit Murl. Generally, these techniques involve
determining the location and binding proximity of a given moiety, the occupied
space of a bound inhibitor, the amount of complementary contact surface between
the inhibitor and protein, the deformation energy of binding of a given compound
and some estimate of hydrogen bonding strength and/or electrostatic interaction
energies. Examples of techniques useful in the above evaluations include: quantum
mechanics, molecular mechanics, molecular dynamics, Monte Carlo sampling,
systematic searches and distance geometry methods [G.R. Marshall, Ann. Rev.
Pharmacol. Toxicol., 1987, 27:193]. Specific computer software has been
developed for use in carrying out these methods. Examples of programs designed
for such uses include:

• Gaussian 92 [M.J. Frisch, Gaussian, Inc., Pittsburgh, PA. ©1993]

• AMBER [P. A. Kollman, University of California at San Francisco,
©1993]

• QUANTA/CHARMM [Molecular Simulations, Inc., San Diego, CA,
©1992]

Other hardware systems and software packages will be known and of evident 
applicability to those skilled in the art. 

The concept of the pharmacophore has been well described in the literature
[D. Mayer, C.B. Naylor, I. Motoc, and G.R. Marshall, J. Comp. Aided Molec.
Design, 1987, 1:3; A. Hopfinger and B.J. Burke, in Concepts and Applications of
Molecular Similarity, 1990, M.A. Johnson and G.M. Maggiora, Ed., Wiley].

Different classes of Murl inhibitors of this invention may also use different scaffolds
or core structures, but all of these cores will allow the necessary moieties to be
placed in the active site such that the specific interactions necessary for binding
result. These compounds are best defined in terms of their ability to match a
pharmacophore, i.e., their structural identity relative to the shape and properties of
the binding domain of the Murl.

Distances to or from any given group are calculated from the center of mass 
of that group. The term "center of mass" refers to a point in three-dimensional space 
which represents a weighted average position of the masses that make up an object. 
Distances between groups may readily be determined using any modeling software 
and other suitable chemical structure software. In addition, specialized, 
commercially-available pharmacophore modeling software enables one to determine 
pharmacophore models from a variety of structural information and data. The 
software may also be used to search a database of three-dimensional structures in 
order to identify compounds that meet specific pharmacophore requirements. 
Examples of this software include: 

• DISCO [Martin, Y.C., Bures, M.G., Danaher, E.A., DeLazzer, J.,
Lico, A., Pavlik, P.A., J. Comput. Aided Mol. Design, 1993, 7:83].
DISCO is available from Tripos Associates, St. Louis, MO.

• CHEM-X, which is developed and distributed by Chemical Design 
Ltd., Oxon, UK and Mahwah, NJ. 

• APEX-3D, which is part of the Insight molecular modeling program, 
distributed by Molecular Simulations, Inc., San Diego, CA. 

• CATALYST [Sprague, P.W., Perspectives in Drug Discovery and 
Design, 1995, 3: 1; Muller, K., Ed., ESCOM, Leiden] CATALYST 
is distributed by Molecular Simulations, Inc., San Diego, CA. 

• UNITY, which is available from Tripos Associates, St. Louis, MO. 
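
The center-of-mass definition given above for inter-group distances can be written out directly. The following Python sketch assumes each group is supplied as a list of (mass, position) pairs. The function names `center_of_mass` and `group_distance` and the two toy groups are hypothetical and do not correspond to any pharmacophore feature named in this application.

```python
import math

def center_of_mass(atoms):
    """Mass-weighted average position of a group.
    atoms: list of (mass, (x, y, z)) tuples."""
    total_mass = sum(mass for mass, _ in atoms)
    return tuple(
        sum(mass * xyz[i] for mass, xyz in atoms) / total_mass
        for i in range(3)
    )

def group_distance(group_a, group_b):
    """Distance (angstroms) between the centers of mass of two groups."""
    return math.dist(center_of_mass(group_a), center_of_mass(group_b))

# Toy groups: two carbon atoms (mass 12) and a single oxygen atom (mass 16).
group_1 = [(12.0, (0.0, 0.0, 0.0)), (12.0, (2.0, 0.0, 0.0))]
group_2 = [(16.0, (1.0, 4.0, 0.0))]

com_1 = center_of_mass(group_1)
separation = group_distance(group_1, group_2)
```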

A typical hydrogen bond acceptor (HBA) is an oxygen or nitrogen,
especially an oxygen or nitrogen that is sp2-hybridized or an ether oxygen. A typical
hydrogen bond donor (HBD) is an oxygen or nitrogen that bears a hydrogen.

During binding, the pharmacophore features of the compounds will occupy certain
regions or pockets of the binding site. In this region, the interaction of the
compounds with the surrounding environment is primarily of a hydrophobic nature.
The pharmacophore features of the present compounds are not limited to
distinct chemical moieties within the same compound. A chemical moiety may
serve as parts of two pharmacophore features.

Thus, using these computer evaluation systems, a large number of
compounds may be quickly and easily examined and expensive and lengthy
biochemical testing avoided. Moreover, the need for actual synthesis of many
compounds is effectively eliminated.

One design approach is to probe Murl of the invention with molecules
composed of a variety of different chemical entities to determine optimal sites for
interaction between candidate Murl binding agents and the enzyme. For example,
high resolution X-ray diffraction data collected from crystals saturated with solvent
allow the determination of where each type of molecule binds. Small molecules
that bind tightly to those sites can then be designed, synthesized, and tested for
their Murl binding activity (Bugg et al., Scientific American, December: 92-98
(1993); West et al., TIPS, 16: 67-74 (1995)).

This invention also enables the development of compounds that can
isomerize to short-lived reaction intermediates in the chemical reaction of a substrate
or other compound that binds Murl. Thus, it is possible to carry out time-dependent
analysis of structural changes in Murl during its interaction with other molecules.
The reaction intermediates of Murl can also be deduced from the reaction product in
co-complex with Murl. Such information is useful to design improved analogues of
known Murl inhibitors or to design novel classes of binding agents based on the
reaction intermediates of the Murl enzyme and Murl binding agent co-complex.

Another approach made possible by this invention is to computationally
screen small molecule databases for chemical entities or compounds that
can bind in whole, or in part, to a binding site or an accessory site of Murl. In this
screening, the quality of fit of such entities or compounds to the binding site may be
judged either by shape complementarity [R.L. DesJarlais et al., J. Med. Chem.,
31:722-729 (1988)] or by estimated interaction energy [E. C. Meng et al., J. Comp.
Chem., 13:505-524 (1992)].

Thus, the Murl structure provided herein permits the screening of known
molecules and/or the designing of new molecules which bind to the Murl structure,
particularly at the active site, via the use of computerized evaluation systems. In one
example, the sequence of Murl, and the Murl structure (e.g., atomic coordinates of
Murl and/or the atomic coordinates of the binding site cavity, bond angles, dihedral
angles, distances between atoms in the binding site region, etc.) as provided by
Figure 4 may be input. Thus, a machine readable medium may be encoded with
data representing the coordinates of Figure 7 in this process. The computer then
generates structural details of the site into which a test compound should fit, thereby
enabling the determination of the complementary structural details of the test
compound/inhibitor.

In one embodiment, the inhibitor belongs to the class of compounds known
as pyrimidinediones found to inhibit H. pylori Murl. In specific embodiments, the
pyrimidinedione is compound A, compound B, compound C, compound D,
compound E, compound F, compound G, compound H, compound I, compound J,
compound K, compound L, compound M, compound N, compound O, compound P,
compound Q, compound R, compound S, compound T, compound U, compound V,
compound W, compound X, compound Y, compound Z, compound AA, compound
AB, compound AC, compound AD, compound AE, compound AF, compound AG,
compound AH, compound AI, compound AJ, or compound AK. The crystalline
complexes are characterized by the space groups and cell dimensions depicted in
Table 5.




IX. Computer-Assisted Methods of Designing and Making Murl Inhibitors 

In one embodiment, the present invention relates to computer-assisted design
of binding agents of Murl, including molecules that fit into or bind to the binding
site of Murl, or a portion of Murl which acts as a binding site or accessory site. A
binding agent of the present invention can be an antibacterial binding agent, such as
a pyrimidinedione, that partially or totally inhibits the enzymatic activity of Murl.

One embodiment of the present invention is a disk on which is stored the
structural coordinates of Murl, wherein the structural coordinates may be used in a
wide variety of computer-assisted methods, such as those described herein.

The crystal structure of Murl, and the binding site thereof described herein,
are useful for the design of agents, particularly selective inhibitory agents, which
inhibit Murl, and, thus, could act as antibacterial agents. In a related embodiment,
the present invention encompasses a method for structure-based drug design for an
agent that inhibits Murl activity.

More particularly, the design of compounds that inhibit Murl according to
this invention generally involves consideration of two factors. First, the compound
must be capable of physically and structurally associating with Murl via covalent
and/or non-covalent interactions. Non-covalent molecular interactions important in
the association of Murl with its substrate, or inhibitor, include hydrogen bonding,
van der Waals and hydrophobic interactions.

Second, the compound must be able to assume a conformation that allows it
to associate with Murl. Although certain portions of the compound will not directly
participate in this association with Murl, those portions may still influence the
overall conformation of the molecule. This, in turn, may have a significant impact
on potency. Such conformational requirements include the overall
three-dimensional structure and orientation of the chemical entity or compound in
relation to all or a portion of a binding site, e.g., a substrate binding site, an activator
binding site, an intradomain interface, an intermolecular dimer interface, or accessory
site thereof, of Murl, or the spacing between functional groups of a compound
comprising several chemical entities that directly interact with Murl.

The potential inhibitory effect of a chemical compound on Murl may be
estimated prior to its synthesis and testing by the use of computer modeling
techniques. If the theoretical structure of the given compound suggests insufficient
interaction and association between it and Murl, synthesis and testing of the
compound is obviated. However, if computer modeling indicates a strong
interaction, the molecule may then be synthesized and tested for its ability to bind to
Murl in a suitable assay. In this manner, synthesis of inactive compounds may be
avoided.

In another embodiment, the invention is a computer-assisted method for
designing a candidate modifier, particularly a candidate inhibitor, of Murl. In the
method, a computer modeling application is supplied with a set of relative structural
coordinates of Murl or a binding domain thereof, and a set of structural coordinates
of a candidate inhibitor of Murl. The set of relative structural coordinates is selected
from sets of relative structural coordinates of a Murl as depicted in Figures 4-19.
The potential interference of the candidate inhibitor with the activity of Murl is
assessed and the candidate inhibitor is structurally modified as needed to produce a
set of structural coordinates for a modified candidate inhibitor. The modified
candidate inhibitor is further assessed, using computer-assisted techniques and,
optionally, in vitro and/or in vivo testing, and modified further, if needed, to
produce a modified candidate inhibitor with enhanced properties (e.g., greater
inhibitory activity than the starting candidate inhibitor).

This invention also enables the development of compounds that can
isomerize to short-lived reaction intermediates in the chemical reaction of a substrate
or other compound that binds to or with Murl. This makes it possible to carry out
time-dependent analysis of structural changes in Murl during its interaction with
other molecules. The reaction intermediates of Murl can also be deduced from the
reaction product in complex with Murl. Such information is useful to design
improved analogues of known racemase inhibitors, or to design novel classes of
inhibitors based on the reaction intermediates of the Murl enzyme and
Murl/inhibitor co-complex. This provides a novel route for designing Murl
inhibitors with both high specificity and stability.

An inhibitory or other binding compound of Murl may be computationally
evaluated and designed by means of a series of steps in which chemical entities or
fragments are screened and selected for their ability to associate with the individual
binding domains (pockets) or other areas of Murl.

In a related embodiment, the present invention encompasses a method for
structure-based drug design for an agent that inhibits Murl activity, comprising
generating a compound structure using a crystalline form of Murl which can be used
for X-ray or nuclear magnetic resonance (NMR) studies. In one embodiment, the
coordinates of Figure 4 are used. Alternatively, coordinates are used that have a root
mean square deviation from the coordinates of Figure 4, with respect to conserved
backbone atoms of the listed amino acid sequence, of not more than 1.0 Å, not more
than 1.5 Å, not more than 2.0 Å, or not more than 2.5 Å. In a
further embodiment, the method additionally comprises generating a conserved
surface of the crystalline form of Murl on a computer screen, generating the spatial
structure of test compounds on a computer screen, and determining if the
compounds having a spatial structure fit the conserved surface. In the example, the
conserved surface of the crystalline form of Murl has a binding site with a crystal
structure which is created by atoms from the following amino acid residues: Asp7,
Ser8, Gly9, Val10, Gly11, Gly12, Val37, Pro38, Tyr39, Gly40, Thr41, Ala69,
Cys70, Asn71, Thr72, Ala73, Gly115, Thr116, Lys117, Ala118, Thr119, Val146,
Ile149, Glu150, Gly180, Cys181, Thr182, and His183 of SEQ ID NO: 2.
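
As an illustration of how the binding-site residues listed above might be pulled out of a coordinate set, the following Python sketch filters PDB-format ATOM records by residue number. It assumes that the coordinates are available as standard fixed-column PDB records and that residue numbering follows SEQ ID NO: 2. The function names and the two sample records are hypothetical and serve only to make the sketch self-contained; they are not taken from the Figures.

```python
# Residue numbers of the binding site recited above (numbering per SEQ ID NO: 2).
BINDING_SITE_RESIDUES = frozenset({
    7, 8, 9, 10, 11, 12, 37, 38, 39, 40, 41, 69, 70, 71, 72, 73,
    115, 116, 117, 118, 119, 146, 149, 150, 180, 181, 182, 183,
})


def select_binding_site_atoms(pdb_lines, residues=BINDING_SITE_RESIDUES):
    """Return (res_name, res_seq, atom_name, (x, y, z)) for binding-site atoms.

    Assumes standard fixed-column PDB ATOM records.
    """
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_seq = int(line[22:26])          # columns 23-26: residue number
        if res_seq not in residues:
            continue
        atom_name = line[12:16].strip()     # columns 13-16: atom name
        res_name = line[17:20].strip()      # columns 18-20: residue name
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.append((res_name, res_seq, atom_name, xyz))
    return atoms


def make_atom_record(serial, atom, res_name, chain, res_seq, x, y, z):
    """Hypothetical helper: format one fixed-column ATOM record for the demo."""
    return "ATOM  {:5d} {:<4s} {:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(
        serial, atom, res_name, chain, res_seq, x, y, z)


# Two made-up records (not from Figure 4): residue 7 is in the site, 20 is not.
records = [
    make_atom_record(1, "CA", "ASP", "A", 7, 11.104, 13.207, 2.100),
    make_atom_record(2, "CA", "LEU", "A", 20, 14.500, 10.000, 5.250),
]
site_atoms = select_binding_site_atoms(records)
```

Only the record for residue 7 is retained in `site_atoms`; the same filter could be applied to a full coordinate listing to isolate the atoms that form the conserved surface.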

In another embodiment, the present invention is a computer-assisted method 
of designing a candidate modifier, particularly a candidate inhibitor, of Murl. The 
method comprises supplying a computer modeling application with a set of relative
structural coordinates of Murl, or a binding site thereof, and a set of structural
coordinates of a candidate inhibitor of Murl; computationally building an agent to
be assessed for its ability to interfere with the Murl or a binding site thereof, wherein
the resulting agent is represented by a set of structural coordinates; and determining
whether the agent is expected to bind, or interfere with, the Murl, or the binding site,
wherein if the agent is expected to bind, or interfere with, the Murl or the binding
site, a candidate inhibitor has been designed.

In certain embodiments the present invention relates to a method for 
generating 3-D atomic coordinates of a protein homologue or a variant of Murl 
using the X-ray coordinates of Murl described in any one of Figures 4-19, 

comprising:

a. identifying one or more polypeptide sequences homologous to Murl;

b. aligning the sequences with the sequence of Murl which comprises a
polypeptide with the amino acid sequence of any one of SEQ ID NOS: 2-34, 40, 44,
46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, and 74;

c. identifying structurally conserved and structurally variable regions
between the homologous sequence(s) and Murl;

d. generating 3-D coordinates for structurally conserved residues of the
homologous sequence(s) from those of Murl using coordinates of Murl, such as
those listed in any one of Figures 4-19;

e. generating conformations for helices, strands, loops, and/or turns in the
structurally variable regions of the homologous sequence(s);

f. building side-chain conformations for the homologous sequence(s); and

g. combining the 3-D coordinates of the conserved residues, loops and
side-chain conformations to generate full or partial 3-D coordinates for the homologous
sequences. In certain embodiments the method further comprises refining and
evaluating the full or partial 3-D coordinates for percent homology with Murl.
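
The final evaluation step can be illustrated with a short Python sketch. The text does not specify how percent homology is computed, so the sketch assumes the simplest common proxy: percent identity over the aligned, non-gap positions of two pre-aligned sequences. The function name and the two short sequence fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) positions at which two sequences match.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # positions with a gap in either sequence are skipped
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments, for illustration only.
identity = percent_identity("MKIGVF-DSG", "MKIGIFGDSG")
```

A value computed this way could then be compared against whichever cutoff is chosen for treating a sequence as a useful template or homologue.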

One embodiment of the present invention relates to any computer-assisted
method using known binding agents of Murl, such as L-glutamate, D-glutamate,
L-serine-O-sulfate, D-serine-O-sulfate, tartrate, citrate, phosphate, sulfate,
D-aspartate, L-aspartate, aziridino-glutamate, N-hydroxyglutamate, 3-chloroglutamate,
a pyrimidinedione, or UDP-MurNAc-Ala, to determine the fit of a known agent for
comparison to a candidate inhibitor.

One embodiment of the present invention is a disk on which is stored the 
structural coordinates of any one of Figures 4-19, wherein the structural coordinates 
may be used in a variety of computer-assisted methods, such as those described
herein.

In a specific embodiment, the computer-assisted method of designing an 
agent that binds Murl comprises the steps of (1) supplying to a computer modeling 
application a set of relative structural coordinates of a crystal of Murl; (2) 
computationally building an agent represented by a set of structural coordinates; and 
(3) determining whether the agent is expected to bind, or interfere with, Murl,
wherein if the agent is expected to bind Murl, an agent that binds Murl has been
designed. In a further embodiment of the present invention, the binding agent has
been designed such that it directly binds to the amino acid residues forming a
binding site of Murl or a surrounding area such that the enzyme is conformationally
hindered and substrate cannot bind to the substrate binding site. The Murl can be
from a Gram negative bacterium, a Gram positive bacterium, or an atypical
bacterium.

In a related embodiment, the present invention encompasses a method for 
structure-based drug design of an agent that binds to Murl, comprising generating a 
compound structure using a crystalline form of Murl wherein the crystalline form of
the Murl is capable of being used for X-ray studies. In one embodiment, the
coordinates of Figure 4 are used. Alternatively, coordinates are used that have a root
mean square deviation from the coordinates of Figure 4, with respect to conserved
backbone atoms of the listed amino acid sequence, of not more than 1.0 Å, not more
than 1.5 Å, not more than 2.0 Å, or not more than 2.5 Å. In a
further embodiment, the method additionally comprises generating a conserved
surface of the crystalline form of Murl on a computer screen, generating the spatial
structure of test compounds on a computer screen, and determining if the
compounds having a spatial structure fit the conserved surface.
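
The root mean square deviation limits recited above can be checked with a few lines of Python. This is a minimal sketch that assumes the two coordinate sets have already been superimposed and that equivalent conserved backbone atoms are listed in the same order; it performs no superposition itself. The function names and the sample coordinates are hypothetical.

```python
import math

RMSD_THRESHOLDS = (1.0, 1.5, 2.0, 2.5)  # angstroms, as recited above


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def tightest_threshold(value, thresholds=RMSD_THRESHOLDS):
    """Smallest recited limit that `value` does not exceed, or None."""
    for limit in sorted(thresholds):
        if value <= limit:
            return limit
    return None


# Made-up backbone atom positions; the model is the reference shifted by 0.5 angstrom.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model = [(x + 0.3, y + 0.4, z) for x, y, z in reference]
deviation = rmsd(reference, model)
limit = tightest_threshold(deviation)
```

In practice the pairing of conserved backbone atoms and the superposition would come from a structure-alignment step performed before this calculation.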

In another aspect, the Murl structure of the invention permits the design and
identification of synthetic compounds and/or other molecules which have a shape
complementary to the conformation of a Murl binding site described herein. Using
known computer systems, the coordinates of the Murl structure of the invention may
be provided in machine readable form, the test compounds designed and/or screened
and their conformations superimposed on the structure of the Murl of the invention.
Subsequently, suitable candidates identified as above may be screened for Murl
binding, bioactivity, and stability.

In another embodiment, the present invention relates to a method of making
a candidate modifier of Murl by chemical, enzymatic or other synthetic method.
Candidate modifiers identified or designed as described herein can be made using
techniques known to those of skill in the art.

In another aspect, the Murl structure of the invention permits the design and
identification of synthetic compounds and/or other molecules which have a shape
complementary to the conformation of the Murl binding site of the invention. Using
known computer systems, the coordinates of the Murl structure of the invention may
be provided in machine readable form, the test compounds designed and/or screened
and their conformations superimposed on the structure of the Murl of the invention.
Subsequently, suitable candidates identified as above may be screened for the
desired Murl inhibitory bioactivity and stability.

Once identified and screened for biological activity, these inhibitors may be 
used therapeutically or prophylactically to block Murl activity and to treat bacterial
infections in a mammalian subject.

Once identified by the modeling techniques, the inhibitor may be tested for 
bioactivity using standard techniques. For example, the Murl structure of the 
invention may be used in binding assays using conventional formats to screen 
inhibitors. Suitable assays for use include, but are not limited to, the enzyme-linked
immunosorbent assay (ELISA) or a fluorescence quench assay. Other assay formats
may be used; these assay formats are not a limitation on the present invention. 

X. MAPS/Molecular Replacement/DynDom

Murl may crystallize in more than one form. Therefore, the structural 
coordinates of Murl as described herein are particularly useful to solve the structure 
of additional crystal forms of Murl, or binding domains of additional crystal forms 
of Murl. Portions of Murl of the present invention function as the active site
(substrate binding site). The structural coordinates may also be used to solve the
structure of Murl mutants, Murl complexes, or of the crystalline form of other
proteins with significant amino acid sequence homology or structural homology to a
functional domain of Murl. In one embodiment, significant amino acid sequence homology
comprises at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% homology to any
functional domain of Murl.

One method that may be employed for this purpose is molecular
replacement. In this method, the unknown crystal structure, whether it is another
crystal form of Murl, a Murl mutant, or a Murl co-complex, or the crystal of some other
protein with significant amino acid sequence homology to any functional domain of
Murl, may be determined using the Murl structure coordinates of this invention.
This method will provide an accurate structural form for the unknown crystal more
quickly and efficiently than attempting to determine such information ab initio.

MOLREP is an integrated molecular replacement program that finds
molecular replacement solutions using a two-step procedure: (1) a rotation function
(RF) search for the orientation of the model and (2) a cross translation function (TF)
and packing function (PF) search for the position of the oriented model. The
translation function checks several peaks of the rotation function by computing a
correlation coefficient for each peak and sorting the result. The packing function is
important in removing incorrect solutions that correspond to overlapping symmetry.
MOLREP can be set to search
for any number of molecules per asymmetric unit and will automatically stop when
no further improvement of the solution can be achieved by adding additional
molecules. The molecular replacement software is part of the CCP4 software package
(Collaborative Computational Project, Number 4, 1994; "The CCP4 Suite: Programs for
Protein Crystallography". Acta Cryst. D50: 760-763).
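
The idea of scoring each candidate peak with a correlation coefficient and sorting the results can be sketched in Python as follows. This is not MOLREP's code and does not reproduce its actual scoring; it assumes, purely for illustration, that each candidate solution yields a list of calculated amplitudes that is compared with the observed amplitudes by a Pearson correlation coefficient. All names and numbers are hypothetical.

```python
import math


def correlation_coefficient(observed, calculated):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(observed)
    if n != len(calculated) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0 or var_c == 0:
        return 0.0  # a constant list carries no correlation information
    return cov / math.sqrt(var_o * var_c)


def rank_solutions(observed, candidates):
    """Score each candidate and return (label, score) pairs, best first."""
    scored = [(label, correlation_coefficient(observed, calc))
              for label, calc in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up amplitudes for two candidate rotation-function peaks.
observed = [10.0, 20.0, 30.0, 40.0]
candidates = {
    "peak_1": [12.0, 19.0, 33.0, 41.0],
    "peak_2": [40.0, 10.0, 30.0, 20.0],
}
ranked = rank_solutions(observed, candidates)
```

Here `peak_1` is ranked first because its calculated values track the observed ones closely, whereas `peak_2` is anticorrelated with them.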

DynDom is a fully-automated program that determines protein domains,
hinge axes and amino acid residues involved in enzyme movement. DynDom can
be used with two conformations of the same protein. Structures can be two X-ray
structures or structures generated using simulation techniques such as molecular
dynamics or normal mode analysis. The application of DynDom provides a view of
the conformational change and visualizes movement of domains as quasi-rigid
bodies. DynDom was used to show that Murl goes through a conformational change
that corresponds to two rigid domains that move relative to each other through a
molecular interface movement. The two conformations seen in the E. faecalis
structure provided the most convincing case, as seen in Figure 3. DynDom
illustrated the different conformations seen with D-glutamate versus L-glutamate
bound to the same protein and suggested that there is a domain
movement associated with the catalytic cycle of the enzyme. The DynDom software
is part of the CCP4 software package (Collaborative Computational Project, Number 4,
1994; "The CCP4 Suite: Programs for Protein Crystallography". Acta Cryst. D50:
760-763).

MAPS (Multiple Alignment of Protein Structures) is an automated program 
for comparisons of multiple protein structures based upon tertiary structures of 
crystal coordinates. When homologous proteins with common structural elements
are available, the MAPS program can automatically superimpose the
three-dimensional models, detect which residues are structurally equivalent among all the
structures and provide the residue-to-residue alignment. The structurally equivalent
residues are defined by the program according to the approximate position of both
main-chain and side-chain atoms of all of the proteins. Thus, in cases in which
different primary amino acid sequences are present, the program picks out common
tertiary structures of the proteins (see Figure 2, for example). Based on structure
similarity, the program calculates a score of structure diversity which can be used to 
build a phylogenetic tree. 

In another aspect, the present invention provides a method involving
molecular replacement to obtain structural information about a molecule or
molecular complex of unknown structure using the software programs described
above and the coordinates described herein.

XI. Assay Methods 

A. In vitro 

Another aspect of this invention involves a method for identifying inhibitors 
of Murl characterized by the crystal structure and novel binding domains described 
herein, and the inhibitors themselves. The novel Murl crystal structure of the
invention permits the identification of inhibitors of Murl activity. Such inhibitors
may bind to all, or a binding domain, of Murl, and may be competitive or
non-competitive inhibitors. Once identified and screened for biological activity, these
inhibitors may be used therapeutically or prophylactically to block bacterial growth
and spread.

One design approach is to probe the Murl of the invention with molecules 
composed of a variety of different chemical entities to determine optimal sites for 
interaction between candidate Murl inhibitors and the enzyme. For example, high
resolution X-ray diffraction and NMR data collected from crystals saturated with
solvent make it possible to determine where each type of solvent molecule resides
within the protein. Small molecules that bind tightly to those sites can then be
designed, synthesized, and tested for their Murl inhibitor activity.

One embodiment of the present invention relates to a method of identifying
an agent that inhibits Murl comprising combining the bacterium with a test agent
under conditions suitable for binding of an agent to the active site, and determining 
whether the test agent inhibits Murl activity, wherein if inhibition occurs, the test 
agent is an inhibitor of Murl activity. 

The present invention encompasses an in vitro assay to identify an inhibitor
of Murl. The assay can be a single or double enzyme activity assay as described in detail
in the Examples or an equivalent in vitro assay system wherein small molecules,
proteins, or fragments thereof are added to the bacterium prior to the addition of
activator (in the case of a Gram negative bacterium) and/or substrate. When growth
of the bacterial cell wall is inhibited compared to the control (which lacks inhibitor),
an inhibitor of Murl has been identified.

One embodiment of the present invention includes a method of testing 
inhibitors that bind to the binding domain identified or produced by the computer- 
assisted models in an in vitro assay comprising culturing bacteria. 

In one representative embodiment, H. pylori Murl was tested for inactivation
with a suicide inhibitor, L-serine-O-sulfate (LSOS), which is known to inhibit Murl from E.
coli. The enzyme was incubated in the presence of 20 mM LSOS, and
at different time intervals, aliquots were removed to determine residual activity. The
initial velocity of purified recombinant H. pylori Murl protein was determined in the
single enzyme coupled assay following incubation with LSOS.
The control was incubated in an identical manner but without
LSOS.
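
A time course of residual activity such as the one described above is commonly summarized by an apparent inactivation rate constant. The Python sketch below assumes, as a simplification not stated in the text, that inactivation follows first-order kinetics, so that ln(A_t/A_0) = -k_obs * t, and estimates k_obs as the negative least-squares slope of ln(activity) against time. The function names are hypothetical and the data are synthetic values generated for the demonstration, not measurements.

```python
import math


def fit_inactivation_rate(times, residual_activity):
    """Estimate k_obs as the negative slope of ln(activity) versus time."""
    n = len(times)
    if n != len(residual_activity) or n < 2:
        raise ValueError("need at least two paired time/activity values")
    logs = [math.log(a) for a in residual_activity]
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    return -sxy / sxx


def half_life(k_obs):
    """Time for activity to fall to 50% under first-order inactivation."""
    return math.log(2) / k_obs


# Synthetic data: activities generated from an assumed k of 0.05 per minute.
times_min = [0.0, 5.0, 10.0, 20.0, 30.0]
activity = [100.0 * math.exp(-0.05 * t) for t in times_min]
k_obs = fit_inactivation_rate(times_min, activity)
t_half = half_life(k_obs)
```

Applying the same fit to the control incubated without LSOS gives a baseline rate against which the inhibitor-dependent loss of activity can be compared.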
In another representative embodiment, the assay has the steps of culturing a
control collection of Gram negative bacteria, such as E. coli, in the appropriate
liquid or solid agar in the presence of activator and substrate, culturing a test collection
of the Gram negative bacteria in the appropriate liquid or solid agar in the
presence of activator, substrate, and inhibitor, and comparing the growth of the bacteria
in the test culture to the control culture, wherein if growth is inhibited, an inhibitor of
Murl has been identified.

In one exemplary embodiment, the assay has the steps of culturing a control
collection of Gram positive or atypical bacteria, such as S. aureus or H. pylori,
in the appropriate liquid or solid agar in the
presence of substrate, culturing a test collection of the bacteria in the appropriate
liquid or solid agar in the presence of substrate and inhibitor, and comparing the
growth of the bacteria in the test culture to the control culture, wherein if growth is
inhibited, an inhibitor of the Murl hinge region has been identified.
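
The comparison of test and control cultures described in the two assays above can be expressed as a percent growth inhibition. The following Python sketch assumes that growth is quantified by a single positive readout per culture (for example, an optical density) and that a cutoff is used to call a compound inhibitory; neither the readout nor the 50% cutoff is specified in the text, and both are hypothetical.

```python
def percent_growth_inhibition(control_growth, test_growth):
    """Percent reduction in growth of the test culture relative to control."""
    if control_growth <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * (1.0 - test_growth / control_growth)


def is_growth_inhibited(control_growth, test_growth, cutoff_percent=50.0):
    """True if inhibition meets the (hypothetical) cutoff."""
    return percent_growth_inhibition(control_growth, test_growth) >= cutoff_percent


# Made-up readouts: control culture 0.80, culture with test compound 0.20.
inhibition = percent_growth_inhibition(0.80, 0.20)
flagged = is_growth_inhibited(0.80, 0.20)
```

With these made-up readouts the test culture grows to one quarter of the control, so the compound would be flagged for follow-up under the assumed cutoff.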

B. In vivo 

One embodiment of the present invention relates to an in vivo analysis of the 
antibacterial activity of the binding agents comprising infecting a mammalian 
subject (preferably a non-human primate or a rodent) with a clinically relevant
amount of bacteria sufficient to establish an infection in the subject. After the
bacteria have established an infection, the inhibitor (e.g., antibacterial binding agent)
will be administered to the subject. A separate control group will be administered a
placebo. Tissue, blood, and blood products can be collected at various time points
to determine the course of infection, and those inhibitors which reduce (partially or
totally) the extent of infection are determined to be effective inhibitors of infections.

XII. Pharmaceutical Compositions/Preparations/Packages 

Inhibitors of the invention can be formulated as a pharmaceutical
composition for administration to mammalian subjects as a broad spectrum
therapeutic for bacterial infections. In a preferred embodiment, the pharmaceutical
composition is formulated for intravenous or oral administration.

The inhibitor can be present in a composition, such as a pharmaceutical
composition, which also comprises, for example, a pharmaceutically acceptable
carrier, a flavoring agent, or adjuvant. Pharmaceutically acceptable carriers and
their formulations are well-known and generally described in, for example,
Remington's Pharmaceutical Sciences (18th Edition, ed. A. Gennaro, Mack
Publishing Co., Easton, PA, 1990). One exemplary pharmaceutically acceptable
carrier is physiological saline. Other examples of pharmaceutically
acceptable carriers include, but are not limited to, ion exchangers, alumina,
aluminum stearate, lecithin, serum proteins, such as human serum albumin, buffer
substances such as phosphates, glycine, sorbic acid, potassium sorbate, partial
glyceride mixtures of saturated vegetable fatty acids, water, salts or electrolytes,
such as protamine sulfate, disodium hydrogen phosphate, potassium hydrogen
phosphate, sodium chloride, zinc salts, colloidal silica, magnesium trisilicate,
polyvinyl pyrrolidone, cellulose-based substances, polyethylene glycol, sodium
carboxymethylcellulose, polyacrylates, waxes,
polyethylene-polyoxypropylene-block polymers, wool fat and self-emulsifying drug
delivery systems (SEDDS) such
as α-tocopherol, polyethyleneglycol 1000 succinate, or other similar polymeric
delivery matrices. The pharmaceutical compositions of the present invention may
be administered to a patient, together with a compound of the present invention.
The pharmaceutical compositions and methods of this invention will be useful
generally for controlling, treating or reducing the advancement, severity or effects of
bacterial infections in vivo. Such pharmaceutical compositions can comprise two or
more antibacterial agents as described herein.

The compounds/compositions of the present invention are also useful as 
commercial reagents which effectively bind to Murl. As commercial reagents, the 
compounds of the present invention, and their derivatives, may be used to block 
Murl activity in biochemical or cellular assays for bacterial Murl or its homologs or
may be derivatized to bind to a stable resin as a tethered substrate for affinity
chromatography applications. These and other uses which characterize commercial
Murl inhibitors will be evident to those of ordinary skill in the art.

The compounds/compositions of the present invention may be employed in a
conventional manner for controlling bacterial infection levels in vivo and for treating
diseases or reducing the advancement or severity of effects which are mediated by
bacteria. Such methods of treatment, their dosage levels, modes of administration
and requirements may be selected by one of ordinary skill in the art from available
methods and techniques.

Alternatively, the compounds/compositions of the present invention may be 
used in compositions and methods for treating or protecting individuals against 
bacterial infections or diseases over extended periods of time. The compounds may
be employed in such compositions either alone or together with other compounds of
this invention in a manner consistent with the conventional utilization of enzyme
inhibitors in pharmaceutical compositions. For example, a compound of the present
invention may be combined with pharmaceutically acceptable adjuvants
conventionally employed in vaccines and administered in prophylactically effective
amounts to protect individuals over an extended period of time against bacterial
infections or diseases.

One embodiment of the present invention is a pharmaceutical package 
comprising a pharmaceutical composition of a Murl inhibitor that binds to the hinge 
region and instructions for its use in the treatment of a bacterial infection. 

One embodiment of the present invention is a pharmaceutical composition
comprising a Murl inhibitor for the treatment of bacterial infections.

XIII. Methods of Treating Bacterial Infections

One embodiment of the present invention comprises a method of treating a
subject having a bacterial infection comprising administering a pharmaceutical
composition of the Murl inhibitor. In one embodiment, the bacterial infection is
caused by a Gram negative, a Gram positive, or an atypical bacterium. The Murl
structure described herein is useful in the structure-based design of Murl inhibitors, which
may be used as therapeutic agents against bacterial infections. In one embodiment,
the bacterial infection is caused by Helicobacter pylori, Campylobacter jejuni,
Porphyromonas gingivalis, Pseudomonas aeruginosa, Deinococcus radiodurans,
Borrelia burgdorferi, Treponema pallidum, Vibrio cholerae, Shewanella
putrefaciens, Escherichia coli, Haemophilus influenzae, Mycobacterium
tuberculosis, Mycobacterium leprae, Lactobacillus fermentum, Pediococcus
pentosaceus, Enterococcus faecalis, Enterococcus faecium, Streptococcus
pyogenes, Streptococcus pneumoniae, Bacillus sphaericus, Staphylococcus aureus,
Staphylococcus haemolyticus, Bacillus anthracis, or Bacillus subtilis.

Pharmaceutical compositions of the present invention also include 
compositions (formulations) which may be administered orally, parenterally, by 
inhalation spray, topically, via ophthalmic solution or ointment, rectally,
intranasally, buccally, vaginally, via an implanted reservoir, intramuscularly,
intraperitoneally, and intravenously to the subject. A patient of the present invention
is preferably a mammal. In a further embodiment, the mammal is a human, a 
primate, a dog, a horse, a cow, a sheep, a rat, a mouse, a pig, etc. 

In one embodiment of the present invention, the Murl inhibitor is
administered continuously to the subject. In one embodiment of the present
invention, the Murl inhibitor is administered to the subject every 1-24 hours. In one
embodiment, administration continues until the bacterial infection is eradicated. 

Dosage levels would be apparent to one of ordinary skill in the art and would 
be determined based on a variety of factors, such as body weight of the individual,
general health, age, the activity of the specific compound employed, sex, diet, time
of administration, rate of excretion, drug combination, the severity and course of the
disease, and the patient's disposition to the disease and the judgment of the treating
physician. The amount of active ingredient that may be combined with the carrier
materials to produce a single dosage form will vary depending upon the host treated
and the particular mode of administration. A typical compound preparation will
contain from about 5% to about 95% active compound (w/w). Preferably, such 
preparations contain from about 20% to about 80% active compound. 

Upon improvement of a patient's condition, a maintenance dose of a 
compound or composition of the present invention may be administered, if
necessary; and the dosage, the dosage form, or frequency of administration, or a
combination thereof, may be modified. In some cases, the subject may require
intermittent treatment on a long-term basis upon any recurrence of disease
symptoms. 

By "treating" a patient suffering from a bacterial infection it is meant that the 
patient's symptoms are partially or totally alleviated, or remain static following 
treatment according to the invention. A patient who has been treated will exhibit a
partial or total alleviation of symptoms and/or bacterial load. The term "treatment"
is intended to encompass prophylaxis, therapy and cure. 

A "patient" or "subject" to be treated by the subject method can mean either 
a human or non-human animal. The mammal can be a primate (e.g., a human, a 
chimpanzee, a gorilla, a monkey, etc.), a domesticated animal (e.g., a dog, a horse, a
cat, a pig, a cow), or a rodent (e.g., a mouse or a rat), etc. 

The phrase "therapeutically effective amount" as used herein means that 
amount of a compound, material, or composition comprising a compound of the 
present invention which is effective for producing a desired therapeutic effect by 
blocking cell wall synthesis of bacterial cells in a patient.

Each of the compositions of the present invention can be used as a 
composition when combined with a pharmaceutically acceptable carrier or excipient. 
"Carrier" and "excipient" are used interchangeably herein. 

The phrase "pharmaceutically acceptable" is employed herein to refer to 
those compounds, materials, compositions, and/or dosage forms which are, within
the scope of sound medical judgment, suitable for use in contact with the tissues of 
human beings and animals without excessive toxicity, irritation, allergic response, or 
other problem or complication, commensurate with a reasonable benefit/risk ratio. 
"Pharmaceutically acceptable carrier" is defined herein as a carrier that is 
physiologically acceptable to the recipient and that does not interfere with, or
destroy, the therapeutic properties of the Murl inhibitor with which it is 
administered. 

XIV. Methods for Conducting Business 

One embodiment of the present invention encompasses a method for 
conducting a pharmaceutical business having the steps of isolating one or more Murl
inhibitors that bind to Murl expressed in bacteria; generating a composition
comprising a Murl inhibitor, which composition has a minimal inhibitory
concentration (MIC) of 8 µg/mL or less; conducting therapeutic profiling of the
composition for efficacy and toxicity in animals; preparing a package insert
describing the composition for treatment of bacterial infections; and marketing the
composition for treatment of bacterial infections.

One embodiment of the present invention encompasses a method for 
conducting a life science business having the steps of isolating one or more Murl 
inhibitors that bind to Murl expressed in bacteria; generating a composition
comprising a Murl inhibitor, which composition has a minimal inhibitory
concentration (MIC) of 8 µg/mL or less; and licensing, jointly developing or selling, to a
third party, the rights for selling the composition.

One embodiment of the present invention encompasses a method for 
conducting a pharmaceutical business having the steps of (a) isolating one or more
Murl inhibitors that bind to Murl expressed in bacteria with a Kd of 1 µM or less; (b)
generating a composition comprising said Murl inhibitors, which composition has a
minimal inhibitory concentration (MIC) of 8 µg/mL or less; (c) conducting
therapeutic profiling of the composition for efficacy and toxicity in animals; (d)
preparing a package insert describing the use of the composition for antibacterial
therapy; and (e) marketing the composition for use as an antibacterial agent.

One embodiment of the present invention encompasses a method for 
conducting a life science business having the steps of (a) isolating one or more Murl 
inhibitors that bind to Murl expressed in bacteria with a Kd of 1 µM or less; (b)
generating a composition comprising said Murl inhibitors, which composition has a
minimal inhibitory concentration (MIC) of 8 µg/mL or less; and (c) licensing, jointly
developing or selling, to a third party, the rights for selling the composition.

XV. Equivalents 

It will be apparent to those skilled in the art that various modifications and
variations can be made in the present invention without departing from the scope or
spirit of the invention. Other embodiments of the invention will be apparent to those
skilled in the art from consideration of the specification and practice of the invention
disclosed herein. It is intended that the specification and examples be considered as 
exemplary only, with the true scope and spirit of the invention being indicated by the 
claims. 

EXAMPLES 

The present invention is illustrated by the following examples which should 
not be construed as limiting in any way. The contents of all cited references 
(including literature references, issued patents, published patent applications as cited 
throughout this application) are hereby expressly incorporated by reference.

The practice of the present invention will employ, unless otherwise 
indicated, conventional techniques of cell biology, cell culture, molecular biology, 
microbiology and recombinant DNA, X-ray crystallography, and molecular 
modeling which are within the skill of the art. Such techniques are explained fully in
the literature. See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed.,
ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press:
1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide
Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S. Patent No: 4,683,195; Nucleic
Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And
Translation (B. D. Hames & S. J. Higgins eds. 1984); B. Perbal, A Practical Guide
To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic
Press, Inc., N.Y.); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.); and
Crystallography Made Crystal Clear: A Guide for Users of Macromolecular Models
(Gale Rhodes, 2nd Ed. San Diego: Academic Press, 2000).

Example 1: Cloning, Crystallization and Characterization of H. pylori Murl

A. Cloning, purification, and characterization of the gene encoding
the Murl of H. pylori

Cloning and purification of Murl have been previously described in U.S.
Provisional Applications 60/435,087 and 60/435,527.

The H. pylori genome contains an open reading frame (ORF) of 255 amino
acids that was found to have homology to the Staphylococcus haemolyticus Murl
gene (dga) (NCBI Accession number U12405) and to the E. coli murl gene, which
encodes Murl activity in that organism. To evaluate whether this H. pylori ORF
encodes a protein with Murl activity, the gene was isolated by polymerase chain
reaction (PCR) amplification cloning and overexpressed in E. coli, and the protein
was purified to apparent homogeneity. A simple assay for Murl activity, based on the
isomerization of D-glutamic acid to L-glutamic acid, was developed to facilitate
purification and for future use as a high-throughput drug screen. The ORF in H.
pylori has been found by gene disruption studies to be essential for viability of H.
pylori cells in laboratory culture.

Cloning of H. pylori murl gene encoding Murl

A 765 base pair DNA sequence encoding the murl gene of H. pylori was
isolated by polymerase chain reaction (PCR) amplification cloning. A synthetic
oligonucleotide primer (5'-AAATAGTCATATGAAAATAGGCGTTTTTG-3'
(SEQ ID NO: 35)) encoding an NdeI restriction site and the 5' terminus of the murl
gene and a primer (5'-AGAATTCTATTACAATTTGAGCCATTCT-3' (SEQ ID
NO: 36)) encoding an EcoRI restriction site and the 3' end of the murl gene were
used to amplify the murl gene of H. pylori using genomic DNA prepared from the
J99 strain of H. pylori as the template DNA for the PCR amplification reactions
(Current Protocols in Molecular Biology, John Wiley and Sons, Inc., F. Ausubel et
al., editors, 1994). To amplify a DNA sequence containing the murl gene, genomic
DNA (25 nanograms) was introduced into each of two reaction vials containing 1.0
micromole of each synthetic oligonucleotide primer, 2.0 mM MgCl2, 0.2 mM of
each deoxynucleotide triphosphate (dATP, dGTP, dCTP and dTTP), and 1.25 units
of heat stable DNA polymerase (Amplitaq, Roche Molecular Systems, Inc.,
Branchburg, NJ, USA) in a final volume of 50 microliters. The following thermal
cycling conditions were used to obtain amplified DNA products for the murl gene
using a Perkin Elmer Cetus/GeneAmp PCR System 9600 thermal cycler:

Conditions for amplification of H. pylori murl (SEQ ID NO: 1):
Denaturation at 94°C for 2 minutes;

2 cycles at 94°C for 15 seconds, 30°C for 30 seconds, and 72°C for 15
seconds;

23 cycles at 94°C for 15 seconds, 53°C for 30 seconds, and 72°C for 15
seconds.

Reactions were concluded at 72°C for 20 minutes.
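
By way of illustration only, the cycling conditions above can be written down as a small data structure and the programmed hold time summed. The following is a minimal sketch; the names PROGRAM and total_hold_seconds are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# Thermal cycling program from the text: (label, number of cycles, [(temp_C, seconds), ...]).
# Labels are descriptive only and are not taken from the specification.
PROGRAM = [
    ("initial denaturation", 1, [(94, 120)]),
    ("2 cycles, 30 C annealing", 2, [(94, 15), (30, 30), (72, 15)]),
    ("23 cycles, 53 C annealing", 23, [(94, 15), (53, 30), (72, 15)]),
    ("final extension", 1, [(72, 1200)]),
]


def total_hold_seconds(program):
    """Sum the programmed hold times; ramping between temperatures is not counted."""
    return sum(cycles * sum(sec for _, sec in steps) for _, cycles, steps in program)


if __name__ == "__main__":
    seconds = total_hold_seconds(PROGRAM)
    print(f"{seconds} s of programmed hold time ({seconds / 60:.0f} min)")
```

The actual run time on a given thermal cycler would be somewhat longer because of ramping.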

Upon completion of thermal cycling reactions, the amplified DNA was
washed and purified using the Qiaquick Spin PCR purification kit (Qiagen,
Gaithersburg, MD USA). The amplified DNA sample was subjected to digestion
with the restriction endonucleases NdeI and EcoRI (New England Biolabs, Beverly,
MA USA) (Current Protocols in Molecular Biology, Ibid). The DNA samples from
each of two reaction mixtures were pooled and subjected to electrophoresis on a
1.0% SeaPlaque (FMC BioProducts, Rockland, ME, USA) agarose gel. DNA was
visualized by exposure to ethidium bromide and long wave UV irradiation.
Amplified DNA encoding the H. pylori murl gene was isolated from agarose gel
slices and purified using the Bio 101 GeneClean Kit protocol (Bio 101, Vista, CA
USA).
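
As an illustrative check, the two primers described above can be scanned for the recognition sequences of the enzymes used in the digestion (CATATG for NdeI, GAATTC for EcoRI). This is a minimal sketch; the helper name site_position is hypothetical, and the primer strings are the sequences of SEQ ID NO: 35 and 36 as given in the text with spacing removed.

```python
# Recognition sequences of the restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC"}

# Primer sequences (5' to 3') paired with the site each is stated to encode.
PRIMERS = {
    "SEQ ID NO: 35": ("AAATAGTCATATGAAAATAGGCGTTTTTG", "NdeI"),
    "SEQ ID NO: 36": ("AGAATTCTATTACAATTTGAGCCATTCT", "EcoRI"),
}


def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])


if __name__ == "__main__":
    for name, (seq, enzyme) in PRIMERS.items():
        print(name, enzyme, "site at index", site_position(seq, enzyme))
```

A result of -1 for either primer would indicate a transcription error in the sequence rather than a property of the cloning strategy.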

Cloning of H. pylori DNA sequences into the pET-23 prokaryotic expression vector: 

The pET-23b vector can be propagated in any E. coli K-12 strain, e.g.,
HMS174, HB101, JM109, DH5α, etc., for the purpose of cloning or plasmid
preparation. Hosts for expression include E. coli strains containing a chromosomal
copy of the gene for T7 RNA polymerase. These hosts are lysogens of bacteriophage
DE3, a lambda derivative that carries the lacI gene, the lacUV5 promoter and the
gene for T7 RNA polymerase. T7 RNA polymerase is induced by addition of
isopropyl-β-D-thiogalactoside (IPTG), and the T7 RNA polymerase transcribes
any target plasmid, such as pET-28b, carrying its gene of interest. Strains used in our
laboratory include BL21(DE3) (Studier, F.W., Rosenberg, A.H., Dunn, J.J., and
Dubendorff, J.W. Meth. Enzymol. 185: 60-89, 1990). The pET-23b vector (Novagen,
Inc., Madison, WI, USA) was prepared for cloning by digestion with NdeI and
EcoRI (Current Protocols in Molecular Biology, ibid). Following digestion, the
amplified, agarose gel-purified DNA fragment carrying the murl gene was cloned
(Current Protocols in Molecular Biology, ibid) into the previously digested pET-23b
expression vector. Products of the ligation reaction were then used to transform the
BL21(DE3) strain of E. coli.

Transformation of competent bacteria with recombinant plasmids: 

Competent bacteria, E. coli strain BL21 or strain BL21(DE3), were
transformed with recombinant pET-23-murl expression plasmid carrying the cloned
H. pylori sequence according to standard methods (Current Protocols in Molecular
Biology, ibid). Briefly, 1 microliter of ligation reaction was mixed with 50 microliters of
electrocompetent cells and subjected to a high voltage pulse, after which samples
were incubated in 0.45 milliliters SOC medium (0.5% yeast extract, 2.0% tryptone,
10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4 and 20 mM glucose) at 37°C
with shaking for 1 hour. Samples were then spread on LB agar plates containing 100
microgram/ml ampicillin for growth overnight. Transformed colonies of BL21 were
then picked and analyzed to evaluate cloned inserts as described below.

Identification of recombinant pET expression plasmids carrying H. pylori
sequences:

Individual BL21 clones transformed with recombinant pET-23-murl were
analyzed by PCR amplification of the cloned inserts using the same forward and
reverse primers specific for each H. pylori sequence that were used in the original
PCR amplification cloning reactions. Successful amplification verified the
integration of the H. pylori sequences in the expression vector (Current Protocols in
Molecular Biology, ibid).

Isolation and Preparation of plasmid DNA from BL21 transformants: 
Colonies carrying pET-23-murl vectors were picked and incubated in 5 ml
of LB broth plus 100 microgram/ml ampicillin overnight. The following day
plasmid DNA was isolated and purified using the Qiagen plasmid purification
protocol (Qiagen Inc., Chatsworth, CA, USA).

Cloning and expression of the E. coli groE operon: 
It has been demonstrated that co-expression of the E. coli murl gene with the
genes in the E. coli groE operon reduces the formation of insoluble inclusion bodies
containing recombinant Murl (Ashiuchi, M., Yoshimura, T., Kitamura, T., Kawata,
Y., Nagai, J., Gorlatov, S., Esaki, N. and Soda, K. 1995, J. Biochem. 117: 495-498).
The groE operon encodes two proteins, GroES (97 amino acids) and GroEL (548
amino acids), which are molecular chaperones. Molecular chaperones cooperate to
assist the folding of new polypeptide chains (F. Ulrich Hartl, 1996, Nature London
381: 571-580).

The 2210 bp DNA sequence encoding the groE operon of E. coli (NCBI
Accession number X07850) was isolated by polymerase chain reaction (PCR)
amplification cloning. A synthetic oligonucleotide primer
(5'-GCGAATTCGATCAGAATTTTTTTTCT-3' (SEQ ID NO: 37)) encoding an EcoRI
restriction site and the 5' terminus of the groE operon containing the endogenous
promoter region of the groE operon and a primer
(5'-ATAAGTACTTGTGAATCTTATACTAG-3' (SEQ ID NO: 38)) encoding a ScaI
restriction site and the 3' end of the groEL gene contained in the groE operon were
used to amplify the groE operon of E. coli using genomic DNA prepared from E.
coli strain MG1655 as the template DNA for the PCR amplification reactions
(Current Protocols in Molecular Biology, Ibid). To amplify a DNA sequence
containing the E. coli groE operon, genomic DNA (12.5 nanograms) was introduced
into each of two reaction vials containing 0.5 micromoles of each synthetic
oligonucleotide primer, 1.5 mM MgCl2, 0.2 mM each deoxynucleotide triphosphate
(dATP, dGTP, dCTP and dTTP) and 2.6 units heat stable DNA polymerases
(Expanded High Fidelity PCR System, Boehringer Mannheim, Indianapolis,
Indiana) in a final volume of 50 microliters. The following thermal cycling
conditions were used to obtain amplified DNA products for the groE operon using a
Perkin Elmer Cetus/GeneAmp PCR System 9600 thermal cycler:

[...] culture at a final
concentration of 1.0 mM. Cells were cultured overnight to induce gene expression
of the H. pylori recombinant DNA constructions.

After induction of gene expression with IPTG, bacteria were pelleted by
centrifugation in a Sorvall RC-3B centrifuge at 3,000 x g for 20 minutes at 4°C.
Pellets were re-suspended in 50 milliliters of cold 10 mM Tris-HCl, pH 8.0, 0.1 M
NaCl and 0.1 mM EDTA (STE buffer). Cells were then centrifuged at 2000 x g for
20 min at 4°C. Pellets were weighed (average wet weight 6 grams/liter) and
processed to purify recombinant protein as described below.

Purification of soluble Murl:

All steps were carried out at 4°C. Cells were suspended in 4 volumes of lysis
buffer (50 mM potassium phosphate, pH 7.0, 100 mM NaCl, 2 mM EDTA, 2 mM
EGTA, 10% glycerol, 10 mM D,L-glutamic acid, 0.1% beta-mercaptoethanol, 200
micrograms/ml lysozyme, 1 mM PMSF, and 10 µg/ml each of leupeptin, aprotinin,
pepstatin, L-1-chloro-3-[4-tosylamido]-7-amino-2-heptanone (TLCK),
L-1-chloro-3-[4-tosylamido]-4-phenyl-2-butanone (TPCK), and soybean trypsin
inhibitor) and ruptured by three
passages through a small volume microfluidizer (Model M-110S, Microfluidics
International Corporation, Newton, MA). The resultant homogenate was diluted
with one volume of buffer A (10 mM Tris-HCl pH 7.0, 0.1 mM EGTA, 10%
glycerol, 1 mM D,L-glutamic acid, 1 mM PMSF, 0.1% beta-mercaptoethanol) and
0.1% Brij-35, and centrifuged (100,000 x g, 1 h) to yield a clear supernatant (crude
extract).

After filtration through a 0.80-µm filter, the extract was loaded directly onto
a 20 ml Q-Sepharose column pre-equilibrated in buffer A containing 100 mM NaCl
and 0.02% Brij-35. The column was washed with 100 ml (5 bed volumes) of buffer
A containing 100 mM NaCl and 0.02% Brij-35, then developed with a 100-ml linear
gradient of increasing NaCl (from 100 to 500 mM) in buffer A. A band of Mr =
28,000 (28 kD) corresponding to Murl, the product of the recombinant H. pylori murl
gene, eluted at a gradient concentration of approximately 200-280 mM NaCl.
Individual column fractions were then characterized for Murl activity (see below for
description of assay) and the protein profile of the fractions was analyzed on 12%
acrylamide SDS-PAGE gels.

Fractions containing Murl were pooled, brought to 70% saturation with solid
(NH4)2SO4, stirred for 20 minutes, and then centrifuged at 27,000 x g for 20
minutes. The resulting pellet was re-suspended in lysis buffer to a final volume of 8
ml and loaded directly onto a 350-ml column (2.2 x 92 cm) of Sephacryl S-100HR
gel filtration medium equilibrated in buffer B (10 mM Hepes pH 7.5, 150 mM
NaCl, 0.1 mM EGTA, 10% glycerol, 1 mM D,L-glutamic acid, 0.1 mM PMSF,
0.1% beta-mercaptoethanol) and run at 30 ml/hour. Fractions found to contain
Murl activity were pooled, and 0.5 volume of buffer C (10 mM Tris pH 7.5, 0.1 mM
EGTA, 10% glycerol, 1 mM D,L-glutamic acid, 0.1 mM PMSF, 0.1%
beta-mercaptoethanol) was added (to reduce the NaCl concentration to 100 mM). The
pool was then loaded onto a MonoQ 10/10 high-pressure liquid chromatography column
equilibrated in buffer C containing 100 mM NaCl. The column was washed with 2
bed volumes of this buffer and developed with a 40 mL linear gradient of increasing
NaCl (from 100 to 500 mM). Murl eluted as a sharp peak at 310 mM NaCl. Fractions
containing Murl activity were pooled, concentrated by dialysis against storage
buffer [50% glycerol, 10 mM 3-(N-morpholino)propanesulfonic acid (MOPS) pH
7.0, 150 mM NaCl, 0.1 mM EGTA, 0.02% Brij-35, 1 mM dithiothreitol (DTT)], and
stored at -20°C.
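
The salt arithmetic in the chromatography steps above can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the gradients are taken to be ideally linear with no dead volume, and buffer C is taken to contain no NaCl, as its listed composition suggests.

```python
def gradient_volume_at(conc_mM, start_mM, end_mM, gradient_ml):
    """Volume (ml) into an ideal linear gradient at which a salt concentration is reached."""
    return gradient_ml * (conc_mM - start_mM) / (end_mM - start_mM)


def diluted_conc(conc_mM, sample_volumes, diluent_volumes, diluent_conc_mM=0.0):
    """Concentration after mixing a sample with a diluent (volumes in any common unit)."""
    total = sample_volumes + diluent_volumes
    return (conc_mM * sample_volumes + diluent_conc_mM * diluent_volumes) / total


if __name__ == "__main__":
    # Q-Sepharose: 100-ml gradient from 100 to 500 mM NaCl; elution at about 200-280 mM.
    print(gradient_volume_at(200, 100, 500, 100), gradient_volume_at(280, 100, 500, 100))
    # MonoQ: 40-ml gradient from 100 to 500 mM NaCl; elution at 310 mM.
    print(gradient_volume_at(310, 100, 500, 40))
    # Pool in buffer B (150 mM NaCl) mixed with 0.5 volume of buffer C.
    print(diluted_conc(150, 1.0, 0.5))
```

Under these assumptions, adding 0.5 volume of salt-free buffer to a 150 mM pool gives the 100 mM NaCl stated in the text.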

Preparation of glutamate-free H. pylori Murl

1 mL of 1-25 mg/ml H. pylori Murl stored in the presence of 1-10 mM
glutamate and 10-50% glycerol was dialyzed 3 times against 1 L of 100 mM
KH2PO4, 100 mM Na2SO4, 5 mM DTT and 1 mM EDTA, pH 8.2, for 3 hours at 4°C.
No glutamate was observable for the dialyzed protein by 1D-NMR or by
using L-glutamate dehydrogenase in a photometric enzyme assay. The specific
activity of the glutamate-free H. pylori Murl was unaltered (kcat = 1.8/min)
compared to an enzyme stored in the presence of glutamate.



B. Assays for Murl activity

1. Conversion of D-glutamate to L-glutamate (single enzyme
coupled assay):

In this assay, the conversion of D-glutamic acid to L-glutamic acid is
coupled to the conversion of L-glutamic acid and NAD+ by L-glutamate
dehydrogenase to 2-oxoglutarate, ammonia and NADH. The production of NADH is measured
as an increase of absorbance at 340 nm (the reduction of NAD+ to NADH) at 37°C.
The standard assay mixture (adapted from Choi, S-Y., Esaki, N., Yoshimura, T., and
Soda, K. 1991, Protein Expression and Purification 2: 90-93) contained 10 mM
Tris-HCl, pH 7.5, 5 mM NAD+, 5 Units/ml L-glutamate dehydrogenase, varying
concentrations of the substrate D-glutamic acid (0.063 mM to 250 mM) and the
purified recombinant H. pylori enzyme Murl (1 µg to 50 µg). The reaction was
started by the addition of either the substrate D-glutamate or the recombinant Murl
after a pre-incubation at 37°C for 5 minutes with all of the other assay ingredients.
The change in absorbance at 340 nm was measured in a Spectra MAX 250. Initial
velocities were derived from the initial slopes.
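
For illustration, an initial slope at 340 nm can be converted into a rate of NADH formation with the Beer-Lambert law. The following is a minimal sketch, assuming the commonly used molar extinction coefficient for NADH at 340 nm of 6220 M^-1 cm^-1; the function names and the default 1 cm path length are assumptions, and the effective path length in a microplate well is normally shorter and depends on the fill volume.

```python
NADH_EXT_340 = 6220.0  # molar extinction coefficient of NADH at 340 nm, M^-1 cm^-1


def nadh_rate_uM_per_min(dA340_per_min, path_cm=1.0):
    """Convert an absorbance slope at 340 nm into NADH formed, in micromolar per minute."""
    return dA340_per_min / (NADH_EXT_340 * path_cm) * 1e6


def specific_activity(dA340_per_min, volume_ml, enzyme_mg, path_cm=1.0):
    """Specific activity in nmol/min/mg from a slope, the assay volume and the enzyme amount."""
    # micromolar multiplied by millilitres gives nanomoles
    nmol_per_min = nadh_rate_uM_per_min(dA340_per_min, path_cm) * volume_ml
    return nmol_per_min / enzyme_mg


if __name__ == "__main__":
    # Hypothetical reading: slope of 0.0622 A/min in 0.2 ml with 10 micrograms of enzyme.
    print(nadh_rate_uM_per_min(0.0622), specific_activity(0.0622, 0.2, 0.010))
```

The numbers in the usage lines are invented for the example and are not measurements from the specification.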

2. Conversion of D-glutamate to L-glutamate (two enzyme coupled
assay)

The activity of Murl, the interconversion of the enantiomers of glutamic acid, can
be measured using D-glutamic acid as substrate as described by the methods of
Gallo and Knowles (Gallo, K.A. and Knowles, J.R., 1993, Biochemistry 32, 3981-
3990). The assay originally was used to measure the Murl activity of Lactobacillus
fermenti and can be adapted for the measurement of Murl activity of the H. pylori
murl gene product isolated as a recombinant protein from E. coli. In this assay, the
measurement of the activity of Murl is linked to an OD change in the visible range
in a series of coupled reactions to the activities of L-glutamate dehydrogenase
(reduction of NAD to NADH). Initial rates were determined by following the
increase in absorbance at 340 nm in a reaction volume of 200 µl containing 50 mM
Tris-HCl pH 7.8, 4% v/v glycerol, 10 mM NAD, 60 Units/ml L-glutamate
dehydrogenase, and varying concentrations of either substrate (from 0.063 mM to
250 mM D-glutamic acid) or purified enzyme (from 1 µg to 50 µg). After a
pre-incubation of all reagents except either the substrate (D-glutamate) or the enzyme
(murl gene product) for a period of 5 minutes, reactions were initiated by adding the
missing ingredient (i.e., the enzyme or the substrate, as required), and the increase in
optical density at 500 nm was measured in a Microplate Spectrophotometer System
(Molecular Devices, Spectra MAX 250). Measurements were followed for 20
minutes, and initial velocities were derived by calculating the maximum slope for
the absorbance increases.

3. Conversion of L-Glutamate to D-Glutamate (two enzyme coupled assay)


The activity of Murl, the interconversion of the enantiomers of glutamic acid, was
measured using L-glutamic acid as a substrate. In this assay, the conversion of
L-glutamate to D-glutamate was monitored spectrophotometrically through a two
enzyme coupling system wherein the production of D-glutamate is coupled to the
incorporation of D-glutamate into UDP-MurNAc-Ala-D-Glu by recombinant E.
faecalis MurD, with concomitant hydrolysis of ATP to ADP and inorganic
phosphate. The inorganic phosphate produced in this reaction is subsequently
consumed by the enzyme purine-nucleoside phosphorylase (PNP) reacting with the
chromogenic substrate 2-amino-6-mercapto-7-methylpurine ribonucleoside (MESG),
which has a spectral band at 360 nm as described in Webb, M.R., Proc. Natl. Acad.
Sci. USA, 89: 4884-4887 (1992). Initial rates of reaction were determined by
following the increase of absorbance at 360 nm in a 100 µL reaction volume
containing 85 µL of reaction buffer (58 mM Tris, pH 8.0, 23.5 mM ammonium
acetate, 23.5 mM magnesium acetate, 2.94 mM dithiothreitol, 2.94 mM adenosine
triphosphate, 0.47 mM UDP-MurNAc-Ala, 0.47 mM MESG, 1.17 units/mL PNP,
16 µg/mL E. faecalis MurD, and 30 nM Murl) and 15 µL of L-glutamate stock (0.05
- 10 mM). The increase in optical density at 360 nm was measured continuously in
a 96-well microtitre plate in a Microplate Spectrophotometer System (Molecular
Devices, SpectraMAX 250). Measurements were followed for 20 min and initial
velocities were derived by calculating the maximum slope for the absorbance
increases.
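
As an illustration of the mixing described above, the final concentration of each reaction buffer component after the 15 µL of L-glutamate stock is added can be computed from the 85:15 volume ratio. The following is a minimal sketch; the names REACTION_BUFFER and final_concentrations are hypothetical, and it is assumed that the listed concentrations refer to the 85 µL of reaction buffer before the substrate is added.

```python
BUFFER_UL, SUBSTRATE_UL = 85.0, 15.0

# Reaction buffer components as listed in the text: name -> (value, unit).
REACTION_BUFFER = {
    "Tris": (58.0, "mM"),
    "ammonium acetate": (23.5, "mM"),
    "magnesium acetate": (23.5, "mM"),
    "dithiothreitol": (2.94, "mM"),
    "ATP": (2.94, "mM"),
    "UDP-MurNAc-Ala": (0.47, "mM"),
    "MESG": (0.47, "mM"),
    "PNP": (1.17, "units/mL"),
    "MurD": (16.0, "ug/mL"),
    "Murl": (30.0, "nM"),
}


def final_concentrations(components, part_ul, other_ul):
    """Scale each concentration by the fraction of the final volume its solution occupies."""
    factor = part_ul / (part_ul + other_ul)
    return {name: (value * factor, unit) for name, (value, unit) in components.items()}


if __name__ == "__main__":
    for name, (value, unit) in final_concentrations(
        REACTION_BUFFER, BUFFER_UL, SUBSTRATE_UL
    ).items():
        print(f"{name}: {value:.3g} {unit}")
```

On this reading the buffer values correspond to round final concentrations, for example about 2.5 mM ATP and about 0.4 mM MESG in the 100 µL reaction.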
Results

Kinetic properties of recombinant H. pylori enzyme:

Kinetic constants for recombinant Murl were estimated by assaying its
activity at various concentrations of protein and D-glutamic acid as described above.
Purified recombinant H. pylori Murl exhibits a Vmax of about 300
nanomoles/min/mg protein (kcat = 8.6 min⁻¹) and a Km of 100 micromolar for
D-glutamate. Severe substrate inhibition was observed. Although the Vmax value is
lower than that observed for highly purified Murl from some other bacterial species,
the Km for D-glutamic acid is lower than that observed for the enzyme from most
other species, resulting in a catalytic efficiency (kcat/Km) which is typical of
purified preparations from E. coli and P. pentosaceus.
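
The relationship between the kinetic constants reported above can be illustrated numerically. The following is a minimal sketch; the function names are hypothetical, and the molecular mass of 28,000 used in the usage lines is the approximate Mr of the band described in the purification, not a value the text states was used for the calculation.

```python
def kcat_per_min(vmax_nmol_min_mg, mr_g_per_mol):
    """Turnover number (per minute) from a specific Vmax (nmol/min/mg) and molecular mass."""
    # 1 mg of enzyme contains 1e6 / Mr nanomoles of enzyme.
    return vmax_nmol_min_mg * mr_g_per_mol / 1e6


def catalytic_efficiency(kcat_min, km_uM):
    """kcat/Km in M^-1 s^-1 from kcat (per minute) and Km (micromolar)."""
    return (kcat_min / 60.0) / (km_uM * 1e-6)


if __name__ == "__main__":
    print(kcat_per_min(300, 28000))         # Vmax of about 300 nmol/min/mg
    print(catalytic_efficiency(8.6, 100))   # reported kcat and Km
```

With Mr = 28,000 the computed kcat is close to the reported 8.6 per minute; the small difference presumably reflects the exact molecular mass used, which is not given here.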

C. Crystallization methodology

One method of crystallizing H. pylori Murl complexed with glutamate
includes the steps of: (a) preparing a first solution comprising reducing agent,
substrate, salt, bactericide, buffer pH 6.5-9.5, and Murl, wherein the reducing
agent, substrate, salt, bactericide and buffer pH 6.5-9.5 are each present in sufficient
concentration to inhibit oxidation of the solution, bind to the Murl, stabilize the
protein and prevent aggregation, inhibit bacterial growth, and control the pH of the
solution, respectively; (b) preparing a second solution comprising salt and buffer pH 6.5-9.5,
wherein the salt and buffer pH 6.5-9.5 are each present in sufficient concentration to
stabilize the protein and prevent aggregation, and control the pH of the solution,
respectively; (c) combining the first solution and the second solution, thereby producing a
combination; and (d) forming drops from the combination in a method of
crystallization under conditions in which crystals of Murl are produced, whereby
crystals of Murl are produced. The first solution can additionally include glycerol in
sufficient concentration to facilitate freezing of the crystals and stabilize the protein,
and the second solution can additionally include a precipitant, an organic additive,
and a reducing agent, each in sufficient concentration to sequester water and force
protein molecules out of solution, precipitate the protein and affect the dielectric of
the solution, and inhibit oxidation of the solution, respectively.

Batch, vapor diffusion, or dialysis methods of crystallization can be
employed to crystallize Murl.

The need for a reducing agent can be relieved by performing the
crystallization under anaerobic conditions, such as under oil or in an anaerobic box.

Organic additives of the present invention include, for example, methanol, 
ethanol, or 2-methyl-2,4-pentanediol (MPD). 

One skilled in the art would recognize that a wide variety of well-known
buffers (e.g., HEPES, Tris, MOPS, etc.) could be used to buffer the solutions at the
proper pH.

Examples of a bactericide are ethylene glycol bis(2-aminoethyl ether)-
N,N,N',N'-tetraacetic acid (EGTA), ethylenediaminetetraacetic acid (EDTA) and
sodium azide.

Salts in the second solution can be, for example, magnesium chloride,
magnesium sulfate, magnesium formate, lithium sulfate, lithium chloride,
ammonium acetate, ammonium sulfate, lithium acetate, ammonium citrate, or
lithium citrate.

Reducing agents in the first or second solutions can be, for example,
dithiothreitol (DTT), tris(2-carboxyethyl)phosphine hydrochloride (TCEP), or
beta-mercaptoethanol, to prevent or retard oxidation of the protein.

Precipitants in the second solution can range from about 0% to about 55%
polyethylene glycol (PEG), or a derivative thereof (e.g., monomethyl ether
polyethylene glycol (MME PEG)). More specifically, the precipitant can range from
about 5% to about 40%, from about 15% to about 30%, or from about 20%
to about 25% PEG. A variety of precipitants, such as PEG (e.g., PEG 500 to 20,000,
or any intermediate PEG), or a derivative thereof, can be used. More specifically, the
precipitant in the second solution is PEG 1,000-10,000, or PEG 2,000-6,000.

One crystal of H. pylori Murl was made by the process of: (a) preparing a first
solution of from about one to about 100 mM D,L-glutamic acid, from about 0.1 mM
to about 5 mM reducing agent, about 0-30% glycerol, about 1-500 mM buffer pH
6.5-9.5, about 50-500 mM salt, about 0.1 mM bactericide, and from about 1 mg/ml
to about 50 mg/ml H. pylori Murl; (b) preparing a second solution of about 50-150
mM salt, about 0-55% precipitant, from about 0-15% organic additive, about 0-30%
glycerol, about 5-200 mM buffer pH 6.5-9.5, and about 0-50 mM reducing agent;
and (c) combining the first solution and the second solution into drops, thereby
producing a combination used in a method of crystallization.

One embodiment of the present invention relates to selected crystals that
were flash frozen in a cold nitrogen stream and tested on an in-house detector.

One H. pylori Murl crystal of the present invention was made by the process
of preparing a first solution of from about one to about 100 mM D,L-glutamic acid,
from about 1 mM to about 5 mM reducing agent, from about 0 to about 30%
glycerol, from about 1 to about 100 mM buffer pH 6.5-9.5, from about 50 to
about 500 mM salt, about 0.1 mM EGTA, and from about 1 mg/ml to about 50
mg/ml H. pylori Murl; preparing a second solution of about 50-150 mM salt, from
about 0 to about 55% precipitant, from about 0 to about 15% organic additive, from
about 0 to about 30% glycerol, from about 5 to about 200 mM buffer pH 6.5-9.5,
and from about 1 to about 10 mM reducing agent; combining the first solution and
the second solution into drops, thereby producing a combination; and forming drops
from the combination in a method of crystallization under conditions in which
crystals of Murl are produced, whereby crystals of Murl were produced.

One crystallized complex of the present invention is characterized as
belonging to the orthorhombic space group P2₁2₁2₁ and having cell dimensions of a
= 62.18 Å, b = 81.07 Å and c = 113.82 Å, wherein α = 90°, β = 90°, and γ = 90°,
wherein the crystallized complex was made by the process of preparing a first
solution of about 10 mM D,L-glutamic acid, about 1 mM DTT, about 10% glycerol,
about 10 mM MOPS pH 7.0, about 50 mM NaCl, about 0.1 mM EGTA, and about 7
mg/ml H. pylori Murl; preparing a second solution of about 80 mM MgCl2, about
25% PEG 3350, about 8% methanol, about 10% glycerol, about 100 mM Tris
pH 8.0, and about 5 mM DTT; combining the first solution and the second solution,
thereby producing a combination; and forming drops from the combination in a
method of crystallization under conditions in which crystals of Murl were produced.

Another crystallized complex of the present invention is characterized as
belonging to the monoclinic space group P2₁ and having cell dimensions of a =
59.20 Å, b = 82.40 Å and c = 106.50 Å, wherein α = 90°, β = 92.15°, and γ = 90°,
wherein the crystallized complex was made by the process of preparing a first
solution of about 5 mM D,L-glutamic acid, about 1 mM TCEP, about 200 mM
ammonium acetate pH 7.4, and about 10 mg/ml H. pylori Murl; preparing a second
solution of about 0.2 mM MgCl2, about 20-25% PEG 4000, about 20% glycerol, and
about 100 mM Tris pH 8.5; combining the first solution and the second solution,
thereby producing a combination; and forming drops from the combination in a
method of crystallization under conditions in which crystals of Murl were produced.

Another crystallized complex of the present invention is characterized as
belonging to the monoclinic space group P2₁ and having cell dimensions of a =
52.28 Å, b = 78.96 Å and c = 59.15 Å, wherein α = 90°, β = 92.64°, and γ = 90°,
wherein the crystallized complex was made by the process of preparing a first
solution of about 5 mM D,L-glutamic acid, about 1 mM TCEP, about 20-25% PEG
4000, about 200 mM ammonium acetate pH 7.4, and about 10 mg/ml H. pylori
Murl; preparing a second solution of about 200 mM MgSO4, about 100 mM Tris pH
8.5, and about 25% PEG 4000; combining the first solution and the second solution,
thereby producing a combination; and forming drops from the combination in a
method of crystallization under conditions in which crystals of Murl were produced.

Another crystallized complex of the present invention is characterized as
belonging to the monoclinic space group P2₁ and having cell dimensions of a =
52.02 Å, b = 80.66 Å and c = 59.18 Å, wherein α = 90°, β = 92.65°, and γ = 90°,
wherein the crystallized complex was made by the process of preparing a first
solution of about 5 mM D,L-glutamic acid, about 1 mM TCEP, about 200 mM
ammonium acetate pH 7.4, and about 10 mg/ml H. pylori Murl; preparing a second
solution of about 200 mM sodium acetate, about 100 mM Tris pH 8.5, about 25%
PEG 4000, and about 15% glycerol; combining the first solution and the second
solution, thereby producing a combination; and forming drops from the combination
in a method of crystallization under conditions in which crystals of Murl were
produced.

Another crystallized complex of the present invention is characterized as
belonging to the monoclinic space group P2₁ and having cell dimensions of a =
52.61 Å, b = 78.40 Å, and c = 59.43 Å, wherein α = 90°, β = 92.33°, and γ = 90°,
wherein the crystallized complex was made by the process of preparing a first
solution of about 5 mM D,L-glutamic acid, about 1 mM TCEP, about 200 mM
ammonium acetate pH 7.4, and about 10 mg/ml H. pylori Murl; preparing a second
solution of about 200 mM Li2SO4, about 100 mM Tris pH 8.5, and about 25% PEG
3350; combining the first solution and the second solution, thereby producing a
combination; and forming drops from the combination in a method of crystallization
under conditions in which crystals of Murl were produced.
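
By way of illustration, the unit cell volume of the orthorhombic and monoclinic crystals described above can be computed from the cell dimensions using the general triclinic formula, and a Matthews coefficient estimated from it. The following is a minimal sketch; the function names are hypothetical, and the number of molecules per asymmetric unit and the molecular mass passed to matthews_coefficient are assumptions chosen for the example rather than values stated in this passage.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume (cubic angstroms) from cell lengths (angstroms) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, z, n_per_asu, mr):
    """Matthews coefficient (cubic angstroms per dalton).

    z is the number of asymmetric units per cell (4 for P2(1)2(1)2(1), 2 for P2(1)).
    """
    return volume / (z * n_per_asu * mr)


if __name__ == "__main__":
    v_ortho = cell_volume(62.18, 81.07, 113.82)
    v_mono = cell_volume(59.20, 82.40, 106.50, beta=92.15)
    print(round(v_ortho), round(v_mono))
    # Assumed for illustration: two molecules per asymmetric unit, Mr of 28,000.
    print(matthews_coefficient(v_ortho, z=4, n_per_asu=2, mr=28000))
```

For a monoclinic cell the formula reduces to a x b x c x sin(β), so the small departure of β from 90° changes the volume only slightly.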

One crystallized complex of the present invention is characterized as
belonging to the orthorhombic space group P2₁2₁2₁ and having cell dimensions of a =
62.9 Å, b = 80.8 Å and c = 113.6 Å, wherein α = 90°, β = 90°, and γ = 90°, wherein
the crystallized complex was made by the process of preparing a first solution of
about 10 mM D,L-glutamic acid, about 1 mM DTT, about 10% glycerol, about 10
mM MOPS pH 7.0, about 50 mM NaCl, about 0.1 mM EGTA, and about 7 mg/ml
H. pylori Murl; preparing a second solution of about 80 mM MgCl2, about 25%
PEG 3350, about 8% methanol, about 10% glycerol, about 100 mM Tris pH
8.0, and about 5 mM DTT; combining the first solution and the second solution,
thereby producing a combination; and forming drops from the combination in a
method of crystallization under conditions in which crystals of Murl were produced.

Crystals having the atomic coordinates of Figure 5 were obtained by vapor
diffusion using the hanging drop method ("Protein Crystallization", Terese M.
Bergfors (Ed.), International University Line, pages 7-15, 1999). Protein had been
stored at -80°C at a concentration of 10 mg/ml in a buffer containing 200 mM
ammonium acetate pH 7.4, 5 mM D,L-glutamate, and 1 mM TCEP. The reservoir
solution typically contained 25-35% PEG 4000, 200 mM magnesium sulfate, 100
mM Tris pH 8.0-9.0, and 1 mM TCEP. Hanging drops were set up by mixing 4
microliters of protein solution with either 2 or 4 microliters of the reservoir solution.
Plate-shaped crystals appeared overnight and reached a typical size of 0.2 x 0.2 x
0.2-0.6 mm³ in a couple of days. Selected crystals were transferred into a modified
reservoir solution now containing 20% glycerol for a few seconds, flash frozen
in a cold nitrogen stream, and tested on an in-house Mar345 detector (MarResearch,
Hamburg, Germany) prior to data collection at a synchrotron.

Crystallization of H. pylori Murl with glutamate and pyrimidinedione inhibitors.

One embodiment of the present invention relates to a crystal of H. pylori
glutamate racemase complexed with a pyrimidinedione having the orthorhombic
space group P2₁2₁2₁, and having cell dimensions a = 61.4 Å, b = 76.3 Å, c = 108.9
Å, and α = 90°, β = 90°, and γ = 90°. A further embodiment encompasses this
crystal made by the process of preparing a first solution of about 5 mM
D,L-glutamate; about 1 mM TCEP; about 200 mM ammonium acetate; about 0.1 M Tris
pH 7.4-8.5; about 500 micromolar of compound A; and about 10 mg/ml glutamate
racemase from H. pylori; preparing a second solution with about 0.1 M Tris pH
7.4-8.5; about 20-25% PEG 3350; about 15-25% glycerol; and about 0.2 mM
ammonium acetate; combining the first solution and the second solution, thereby
producing a combination; and forming drops from the combination in a method of
crystallization under conditions in which crystals of glutamate racemase are
produced.

One embodiment of the present invention is crystallized H. pylori glutamate
racemase complexed with a pyrimidinedione having the orthorhombic space group
P2₁2₁2, and having cell dimensions a = 60.7 Å, b = 77.5 Å, c = 56.6 Å, and α = 90°,
β = 90°, and γ = 90°. A further embodiment encompasses this crystal made by the
process of preparing a first solution of about 5 mM D,L-glutamate; about 1 mM
TCEP; about 200 mM ammonium acetate; about 0.1 M Tris pH 7.4-8.5; about 500
micromolar of compound A; and about 10 mg/ml glutamate racemase from H.
pylori; preparing a second solution with about 0.1 M Tris pH 7.4-8.5; about 20-25%
PEG 3350; about 15-25% glycerol; and about 0.2 mM ammonium acetate;
combining the first solution and the second solution, thereby producing a
combination; and forming drops from the combination in a method of crystallization
under conditions in which crystals of glutamate racemase are produced.

Another embodiment of the present invention is crystallized H. pylori
glutamate racemase complexed with a pyrimidinedione having the monoclinic space
group P2₁, and having cell dimensions a = 57.1 Å, b = 78.0 Å, c = 58.55 Å, and α =
90°, β = 97.91°, and γ = 90°. A further embodiment encompasses this crystal made
by the process of preparing a first solution of about 5 mM D,L-glutamate; about 1
mM TCEP; about 200 mM ammonium acetate; about 0.1 M Tris pH 7.4-8.5; about
500 micromolar of compound A; and about 10 mg/ml glutamate racemase from H.
pylori; preparing a second solution with about 0.1 M Tris pH 7.4-8.5; about 20-25%
PEG 3350; about 15-25% glycerol; and about 0.2 mM ammonium acetate;
combining the first solution and the second solution, thereby producing a
combination; and forming drops from the combination in a method of crystallization
under conditions in which crystals of glutamate racemase are produced.

D. Data Collection 

Crystals diffracted to about 2.8 Å resolution using an in-house X-ray source
(MarResearch 345 mm image-plate detector system with X-rays generated on a Rigaku
RU300HB rotating anode operated at 50 kV and 100 mA). A complete data set was
collected at ID2, ESRF to 2.3 Å resolution. MAD (multiwavelength anomalous
diffraction) Se-Met data were collected at BM14, ESRF. Three data sets were
collected at three different wavelengths (0.91833 Å, 0.97804 Å, and 0.97821 Å).
The data were processed using Denzo (Z. Otwinowski and W. Minor, "Processing of
X-ray Diffraction Data Collected in Oscillation Mode", Methods in Enzymology,
Volume 276: Macromolecular Crystallography, part A, p. 307-326, 1997, C.W.
Carter, Jr. & R.M. Sweet, Eds., Academic Press, New York). Statistics of the MAD
data collection are shown in Table 6.

A high resolution data set extending to 1.9 Å resolution was collected at beam
line 711 at MaxLab from a crystal grown in the presence of 0.1 M Tris pH 8.5,
0.2 M MgSO4, and 25% PEG 4000 that belonged to space group P2₁ (a = 52.28 Å,
b = 78.96 Å, c = 59.14 Å, α = 90°, β = 92.64°, and γ = 90°). The data were
processed, scaled, and merged using MOSFLM, SCALA, and TRUNCATE (The
CCP4 Suite: "Programs for Protein Crystallography", Acta Cryst. D50: 760-763).
The data set comprised 144510 measurements of 34987 unique reflections giving a
multiplicity of 4.1, a completeness of 92.3% and an overall R-merge of 8.1%.
Additional cycles of refinement with the program CNS and model building with the
program ONO gave a final model consisting of two polypeptide chains of 255 amino
acid residues (residues 1-255), two D-glutamic acids and 377 ordered water
molecules. The final R-values were R = 0.2061 and R-free = 0.2458.

One crystal belongs to the orthorhombic space group P2₁2₁2₁ and has cell
dimensions of a = 62.18 Å, b = 81.07 Å and c = 113.82 Å, wherein α = 90°, β = 90°,
and γ = 90°. One crystal belongs to the monoclinic space group P2₁ and has
cell dimensions of a = 59.20 Å, b = 82.40 Å and c = 106.50 Å, wherein α = 90°, β =
92.15°, and γ = 90°. One crystal belongs to the monoclinic space group P2₁ and has
cell dimensions of a = 59.67 Å, b = 78.82 Å and c = 59.34 Å, wherein α = 90°, β =
92.35°, and γ = 90°. One crystal belongs to the monoclinic space group P2₁ and has
cell dimensions of a = 52.02 Å, b = 80.66 Å and c = 59.18 Å, wherein α = 90°, β =
92.65°, and γ = 90°. One crystal belongs to the monoclinic space group P2₁ and has
cell dimensions of a = 52.48 Å, b = 80.71 Å, and c = 59.42 Å, wherein α = 90°,
β = 91.68°, and γ = 90°. Each of these space groups is encompassed by the structural
coordinates of Figure 6.
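
As an illustrative aside, the cell dimensions listed above can be turned into unit-cell volumes with the general triclinic volume formula, which reduces to a·b·c for the orthorhombic cells and to a·b·c·sin β for the monoclinic ones. The following is a minimal sketch; the function name `cell_volume` is hypothetical and is not part of any program cited in this document.

```python
import math

def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic angstroms; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

# Two of the cells quoted in the text.
orthorhombic = cell_volume(62.18, 81.07, 113.82, 90, 90, 90)
monoclinic = cell_volume(59.67, 78.82, 59.34, 90, 92.35, 90)

print(f"P2(1)2(1)2(1) cell: {orthorhombic:.0f} cubic angstroms")
print(f"P2(1) cell:         {monoclinic:.0f} cubic angstroms")
```

Comparing volumes in this way is one quick check on whether two crystal forms are likely to hold the same number of molecules per asymmetric unit.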

Table 6: MAD Data Collected at ESRF, BM14; Resolution 2.8 Å (CCD detector/DENZO)

Dataset   Wavelength (Å)   Nmeas   Nref    %poss   Multiplicity   Rfac    Ranom
PK        0.97805          47166   13965   98.6    3.4            0.048   0.052
PI        0.97821          49507   14015   98.9    3.5            0.045   0.036
RM        0.91834          47118   13976   98.6    3.4            0.047   0.032

Rfac = Sum|<I> - Ij| / Sum|Ij|

Ranom = Sum|<I+> - <I->| / Sum|<I+> - <I->|
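
To make the Rfac definition given with Table 6 concrete, the following minimal sketch evaluates it over repeated measurements of each reflection. The function name `rfac` and the intensities are hypothetical and serve only as illustration; they are not data from this work, and this is not the DENZO implementation.

```python
def rfac(observations):
    """Rfac = Sum|<I> - Ij| / Sum|Ij| over repeated measurements of each reflection.

    `observations` maps a reflection index (h, k, l) to its list of measured
    intensities Ij; <I> is the mean of that list.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(mean_i - i_j) for i_j in intensities)
        denominator += sum(abs(i_j) for i_j in intensities)
    return numerator / denominator

# Hypothetical intensities, for illustration only (not data from this work).
toy = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(f"Rfac = {rfac(toy):.3f}")
```

A reflection whose repeated measurements agree exactly contributes nothing to the numerator, so lower values indicate better internal agreement of the data set.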



E. Phase Determination by MAD

The selenium sites were found by the program RANTAN (The CCP4 Suite:
"Programs for Protein Crystallography", Acta Cryst. D50: 760-763; Yao Jia-xing
(1981), Acta Cryst. A37: 642-644) and verified by difference Fourier.
Heavy atom refinement and phasing were carried out with MLPHARE (The
CCP4 Suite: "Programs for Protein Crystallography", Acta Cryst. D50: 760-763;
Z. Otwinowski, Daresbury Study Weekend proceedings, 1991). A clear protein and
solvent boundary was observed after solvent flattening. Phasing statistics using
MLPHARE are shown in Table 7.

Further heavy atom refinement was carried out using the program SHARP
(de la Fortelle, E. and Bricogne, G. (1997), "Maximum likelihood heavy-atom
parameter refinement for multiple isomorphous replacement and multiwavelength
anomalous diffraction methods", in Carter, C.W. and Sweet, R.M. (ed.), Methods in
Enzymology vol. 276, Academic Press, Orlando, FL: pp. 472-494), which
resulted in a slightly improved electron density map.

It was already known that the protein exists as a homo-dimer, but inspection
of the electron density map identified additional molecular symmetry. Each
monomer has a pseudo two-fold symmetry that divides the monomer into two
domains with very similar alpha/beta type folds. The binding site is found between
the two domains. Thus, the active form of the protein is a four-domain structure
with two seemingly independent binding sites.

Phase improvement using two-fold density averaging (DM, The CCP4 Suite:
"Programs for Protein Crystallography", Acta Cryst. D50: 760-763, 1995) yielded
an electron density map of much better quality. The polypeptide chain was easily
traced using the improved map, guided by the 8 seleno-methionine sites. More
than 80% of the amino acid sequence was traced with relatively high confidence.
The co-location of the two proposed active-site cysteine residues (Cys70 and
Cys181, Cα-Cα distance 7.5 Å) validated the chain tracing. Further analysis of this
tentative active site confirmed that a high number of conserved residues
mapped into the same site. In addition, a piece of significant electron density was
found that did not belong to any of the protein side chains. This density was
interpreted and confirmed as arising from the bound substrate, D-glutamate.



Table 7: Phasing statistics using MLPHARE (3.0 Å)

Dataset       Wavelength (Å)   R_Cul acen   RhPow acen   R_Cul cent   RhPow cent   RCul anom
PI (native)   0.97821          -            -            -            -            0.86
RM            0.91834          0.84         1.12         0.70         0.85         0.84
PK            0.97805          0.97         0.40         0.96         0.27         0.75

FOM 0.4470 (acent: 0.4479, cent: 0.4417) 

Phase improved by two-fold density averaging (DM) 



F. Refinement of the Crystal Structure 

A first round of refinement using REFMAC (The CCP4 Suite: "Programs for
Protein Crystallography", Acta Cryst. D50: 760-763; G.N. Murshudov, A.A. Vagin
and E.J. Dodson, "Refinement of Macromolecular Structures by the
Maximum-Likelihood Method", Acta Cryst. D53: 240-255 (1997)) after
completing model building for protein atoms yielded an R-value of 0.26 and an
R-free value of 0.37. The R-value describes the discrepancy between the observed
data and synthetic data calculated from the model. The R-free is the same quantity
calculated from a test set of reflections, usually 4% of the total, that is set aside
at the beginning of the refinement and serves as an unbiased reference to avoid
over-fitting of the data. The R-value is resolution dependent but should be below
25%, and the R-free is normally not more than 5% higher.
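
The relationship between the R-value and the R-free described above can be sketched in a few lines. The sketch assumes the conventional crystallographic residual, Sum||Fobs| - |Fcalc|| / Sum|Fobs|, which the text describes but does not write out, and it sets aside every 25th reflection (4%) as the test set. The function names and amplitudes are hypothetical; this is not the REFMAC or CNS implementation.

```python
def r_value(f_obs, f_calc):
    """Conventional residual: Sum| |Fobs| - |Fcalc| | / Sum|Fobs|."""
    return sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc)) / sum(abs(o) for o in f_obs)

def split_work_test(n_reflections, every=25):
    """Flag every 25th reflection (4%) as the test set; the rest form the working set."""
    test = [i for i in range(n_reflections) if i % every == 0]
    work = [i for i in range(n_reflections) if i % every != 0]
    return work, test

# Hypothetical amplitudes, for illustration only.
f_obs = [10.0 + (i % 7) for i in range(100)]
f_calc = [f * (1.1 if i % 2 else 0.9) for i, f in enumerate(f_obs)]

work, test = split_work_test(len(f_obs))
r_work = r_value([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_value([f_obs[i] for i in test], [f_calc[i] for i in test])
print(f"R = {r_work:.3f}, R-free = {r_free:.3f}")
```

In practice the test reflections are usually picked at random rather than at a fixed stride, and they must be excluded from every refinement cycle for the R-free to remain unbiased.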

At this point of the refinement a high resolution native data set was
introduced. The data were not completely isomorphous, so molecular replacement
(MR) was performed using the model refined against the multiwavelength
anomalous diffraction (MAD) data. MR was performed using AMoRe (The
CCP4 Suite: "Programs for Protein Crystallography", Acta Cryst. D50: 760-763; J.
Navaza, Acta Cryst. A50: 157-163 (1994)) on the earlier native data (2.3 Å
resolution), and resulted in a clear solution. Rigid body and NCS restrained
refinement using REFMAC was carried out. D-glutamate was built in the active site
and a crystallographic R/R-free of 0.27/0.34 was obtained.

In the later stage of refinement, simulated annealing with torsion angle
dynamics was employed using the program CNS (Crystallography & NMR System,
Acta Cryst. D54: 905-921 (1998)), which yielded a lower R-free value and an
improved electron density map.

The refinement statistics for CNS are as follows:
Refinement resolution: 50-2.3 Å;
Final R = 0.2283, R-free = 0.2808;
Total number of reflections used: 22893 (87.2%);
Number of reflections in working set: 21719 (82.7%); and
Number of reflections in test set: 1174 (4.5%).

G. Nuclear magnetic resonance (NMR) of H. pylori Murl 

Nuclear magnetic resonance (NMR) provides a method by which the
structure and conformation of the amino acid residues of the protein can be
visualized in solution. NMR experiments were performed at 303 K on a Bruker
Avance 800 MHz system equipped with a triple-resonance (1H/13C/15N)
single-gradient 5-mm probe. Samples were at pH 7.5 in a 100 mM phosphate buffer
containing 100 mM K2SO4, 5 mM DTT and 1 mM EDTA. Protein concentration
was 0.3 mM. [15N,1H] correlation experiments were recorded to follow proton
stability with increasing D-glutamate concentration. Evolution times were around
90 minutes in the proton dimension, and 30 minutes in the nitrogen dimension,
with a total acquisition time of 2.5 hours. Data sets were processed and analyzed
with the program nmrPipe (Delaglio, F., Grzesiek, S., Vuister, G., Zhu, G., Pfeifer,
J., and Bax, A. "NMRPipe: a multidimensional spectral processing system based on
UNIX Pipes", J. Biomol. NMR 6: 277-293 (1995)).

Production of isotopically labeled Murl protein for NMR studies. 

Plate pre-growth and adaptation

Fresh H. pylori J99 Murl transformants were grown to the size of small
pinheads for 30-36 hours on plates prepared as follows:

0.2 grams of agar in 19 mL D2O is melted by mild heating (e.g.,
microwave). The melted agar is cooled to around 60°C and 2 mL of
BioExpress®-1000 media (U-D, 98%, U-15N, 96-99% for 15N/2D labeling or U-13C,
97-98%, U-15N, 96-99%, U-D, 98%, for 15N/2D/13C labeling; Cambridge Isotope Labs,
Andover, MA, USA) and the respective antibodies were labeled.




Cell growth and inoculation 

The colonies on the plate were suspended using 2 mL of culture medium
(prepared as described below) and the OD600 of the solution was determined. 1 L of
culture medium (prepared as described below) was inoculated with a total of 0.1
OD600. At OD600 = 0.5-0.8 (about 14-20 hours of growth time) H. pylori Murl
protein production was induced using 0.5 mM IPTG. At the time of induction, 4
mL/L of BioExpress®-1000 media (U-D, 98%, U-15N, 96-99% for 15N/2D labeling
or U-13C, 97-98%, U-15N, 96-99%, U-D, 98%, for 15N/2D/13C labeling) were
added.

Culture medium preparation: M9 Minimal Media spiked with BioExpress®-1000
media (1 L total volume)

Na2HPO4 (7.26 g), KH2PO4 (3.0 g), NaCl (0.5 g), NH4Cl (1.0 g; 15N labeled),
10 mL MgSO4 (100 mM in D2O), 10 mL CaCl2 (10 mM in D2O), 10 mL glucose (20%
in D2O), 10 mL thiamine (0.1% in D2O), 760 mL D2O, 16 mL BioExpress®-1000
media (U-D, 98%, U-15N, 96-99% for 15N/2D labeling or U-13C, 97-98%, U-15N,
96-99%, U-D, 98%, for 15N/2D/13C labeling).

The protein was purified and characterized as described above. 

Example 2: Crystallization and Characterization of Murl from E. coli

Cloning and characterization of Murl from E. coli has been described in
detail in U.S. Provisional Application 60/435,167.

Cloning, overexpression, purification, and biochemical characterization of E.
coli Murl have been reported by H.T. Ho et al. (Biochemistry 34: 2464-2470 (1995)).

A. Crystallization conditions and space groups 

One method of crystallizing E. coli Murl complexed with glutamate includes
the steps of: (a) preparing a first solution comprising a reducing agent, a substrate,
an activator, a salt pH 6.5-9.5, and Murl, wherein the reducing agent, substrate, and
salt pH 6.5-9.5 are each present in sufficient concentration to inhibit oxidation of
the solution, bind to the Murl, and stabilize the protein and prevent aggregation
while controlling the pH of the solution, respectively; (b) preparing a second solution
comprising salt pH 4.5-9.5, wherein the salt pH 4.5-9.5 is present in sufficient
concentration to stabilize the protein and prevent aggregation while controlling the
pH of the solution; (c) combining the first solution and the second solution, thereby
producing a combination; and (d) forming drops from the combination in a method
of crystallization under conditions in which crystals of Murl are produced, whereby
crystals of Murl are produced.

The first solution can additionally include glycerol in sufficient
concentration to facilitate freezing of the crystals and stabilize the protein, and the
second solution additionally comprises a precipitant, each in sufficient concentration
to force the protein out of solution, and inhibit oxidation of the solution,
respectively.

Batch, vapor diffusion, or dialysis methods of crystallization can be 
employed. 

The need for a reducing agent can be relieved by performing the 
crystallization under anaerobic conditions, such as under oil or in an anaerobic box. 

Salts can be, for example, magnesium chloride, magnesium sulfate,
magnesium formate, lithium sulfate, lithium chloride, ammonium acetate,
ammonium sulfate, lithium acetate, ammonium citrate, or lithium citrate.

Reducing agents in the first or second solutions can be, for example,
Tris(2-carboxyethyl)phosphine hydrochloride (TCEP), beta-mercaptoethanol, or
dithiothreitol (DTT), to prevent or retard oxidation of the protein.

In a further embodiment, the precipitant in the second solution is from about
0% to about 55% polyethylene glycol (PEG), or a derivative thereof (e.g.,
mono-methyl-ether polyethylene glycol (MME PEG)). In a further embodiment, the
precipitant in the second solution comprises from about 5% to about 40% PEG. In
an additional embodiment, the precipitant in the second solution comprises from
about 15% to about 30% PEG. In another embodiment, the precipitant in the second
solution comprises from about 20% to about 25% PEG.

A variety of precipitants, such as PEG (e.g., PEG 500 to 20,000, or any
intermediate PEG), or a derivative thereof, can be used. In a further embodiment,
the precipitant in the second solution is PEG 1,000-10,000. Alternatively, the
precipitant in the second solution is PEG 2,000-6,000.
One crystal of E. coli Murl was made by the process of preparing a first
solution of from about one to about 100 mM D,L-glutamic acid; from about 0.1 mM
to about 5 mM reducing agent; from about 0 to about 30% glycerol; from about 0 to
about 500 mM buffer pH 6.5-9.5; from about 50 to about 500 mM salt; about 0.6
mM UDP-MurNAc-Ala; and from about 1 mg/ml to about 50 mg/ml E. coli Murl;
preparing a second solution of from about 50 to about 500 mM salt pH 4.5-7.5; from
about 0 to about 55% precipitant; and from about 0 to about 30% glycerol;
combining the first solution and the second solution, thereby producing a
combination; and forming drops from the combination in a method of crystallization
under conditions in which crystals of Murl were produced.

Selected crystals were flash frozen in a cold nitrogen stream and tested on
an in-house detector.

One crystal of E. coli Murl was made by the process of preparing a first solution of
from about one to about 100 mM D,L-glutamic acid; from about 0.1 mM to about 5
mM reducing agent; from about 0 to about 30% glycerol; from about 50 to about
500 mM salt pH 6.5-9.5; about 0.6 mM UDP-MurNAc-Ala; and from about 1 mg/ml
to about 50 mg/ml E. coli Murl; preparing a second solution of from about 50 to
about 500 mM salt pH 4.5-7.5; and from about 0 to about 55% precipitant;
combining the first solution and the second solution, thereby producing a
combination; and forming drops from the combination in a method of crystallization
under conditions in which crystals of Murl are produced, whereby crystals of Murl
were produced.

One embodiment of the present invention comprises making an E. coli Murl
crystal by the process described herein, wherein the crystal is characterized by the
structural coordinates of Figure 11. One embodiment of the present invention
comprises a method of preparing an E. coli Murl crystal complexed with glutamate
by the process described herein, wherein the crystal is characterized by the structural
coordinates of Figure 8.

One crystallized complex of the present invention is characterized as
belonging to the orthorhombic space group C222₁ and having cell dimensions of a =
83.05 Å, b = 112.82 Å and c = 74.12 Å, wherein α = 90°, β = 90°, and γ = 90°,
wherein the crystallized complex was produced by the process of preparing a first
solution of about 5 mM D,L-glutamic acid; about 1 mM TCEP; about 200 mM
ammonium acetate pH 7.4; about 10 mg/ml E. coli Murl; and about 0.6 mM of the
activator, UDP-MurNAc-Ala; preparing a second solution of about 100 mM sodium
acetate pH 4.5; about 5-10% MME 2000; about 30% glycerol; combining the first
solution and the second solution, thereby producing a combination; and forming
drops from the combination in a method of crystallization under conditions in which
crystals of Murl were produced.

One crystallized complex of the present invention is characterized as
belonging to the monoclinic space group P2₁ and having cell dimensions of a =
70.04 Å, b = 74.13 Å and c = 70.10 Å, wherein α = 90°, β = 107.15°, and γ = 90°,
wherein the crystallized complex was produced by the process of preparing a first
solution of about 5 mM D,L-glutamic acid; about 1 mM TCEP; about 200 mM
ammonium acetate pH 7.4; about 10 mg/ml E. coli Murl; and about 0.6 mM of the
activator, UDP-MurNAc-Ala; preparing a second solution of about 100 mM sodium
acetate pH 4.5; about 5-10% MME 2000; about 30% glycerol; combining the first
solution and the second solution, thereby producing a combination; and forming
drops from the combination in a method of crystallization under conditions in which
crystals of Murl were produced.

Crystals were obtained by vapor diffusion using the hanging drop method
("Protein Crystallization", Terese M. Bergfors (Editor), International University
Line, pages 7-15, 1999). Protein had been stored in 50 microliter aliquots at -80
degrees at a concentration of 10 mg/ml in a buffer containing 200 mM ammonium
acetate pH 7.4, 5 mM D,L-glutamic acid and 1 mM TCEP. Prior to crystallization,
the activator, UDP-MurNAc-Ala, was added to the solution at a final concentration
of 0.6 mM. The reservoir solution typically contained 100 mM sodium acetate pH
4.5, 25% polyethylene glycol mono-methyl ether (MME) 2000, and 30% glycerol.
Drops were set up by mixing 2 microliters of protein solution with 2 microliters of
reservoir solution. Plate-shaped crystals appeared overnight and reached a typical
size of 0.4 x 0.3 x 0.2 mm³ in a week. Se-Met substituted Murl was produced and
crystallized under conditions similar to those for the wild-type protein. The
substitution of the Se-Met was verified by mass spectroscopy. Selected crystals were
flash frozen in a cold nitrogen stream and tested on an in-house detector. The crystal
belongs to the orthorhombic space group C222₁ with cell dimensions of a = 83.05 Å,
b = 112.82 Å and c = 74.12 Å, wherein α = 90°, β = 90°, and γ = 90°. Crystals can
be flash-cooled to 95 K directly from the drop without ice formation.

B. Data Collection 

Crystals diffracted to about 2.3 Å resolution using an in-house X-ray source
(MarResearch 345 mm image-plate detector system with X-rays generated on a Rigaku
RU300HB rotating anode operated at 50 kV and 100 mA). A complete data set was
collected at ID14.4, ESRF to 1.9 Å resolution. MAD (multiwavelength
anomalous diffraction) Se-Met data were collected at BM14,
ESRF. Three data sets were collected at three different wavelengths (0.9786 Å,
0.9789 Å, and 0.9184 Å).

The data were processed, scaled, and merged using MOSFLM, SCALA and
TRUNCATE (The CCP4 Suite: "Programs for Protein Crystallography", Acta Cryst.
D50: 760-763) and then scaled together using SCALEIT (The CCP4 Suite:
"Programs for Protein Crystallography", Acta Cryst. D50: 760-763). Statistics of
the MAD data collected are shown in Table 8.

Table 8: MAD Data Collected at ESRF, BM14; Resolution 2.2 Å (CCD detector/DENZO)

Dataset   Wavelength (Å)   Nmeas    Nref    %poss   Multiplicity   Rfac   Ranom
PK        0.978            119109   17758   99.5    6.7            5.1    4.8
PI        0.9778           120976   17788   99.5    6.8            5.2    3.0
RM        0.9184           101650   18008   100.0   7.3            7.3    4.0

Rfac = Sum|<I> - Ij| / Sum|Ij|

Ranom = Sum|<I+> - <I->| / Sum|<I+> - <I->|



C. Phase Determination by MAD 



The crystal has one monomer per asymmetric unit and all 5 of the
seleno-methionine sites could be easily found by the program SOLVE (Terwilliger, T.C.
and J. Berendzen, "Automated MAD and MIR structure solution", Acta Cryst. D55:
849-861 (1999)). The statistics obtained from the same program are
shown in Table 9.

Table 9: Figure of merit (FOM) and estimates of lack-of-closure (LOC)
residuals as a function of resolution from the solution found with SOLVE.

DMIN:          Total    7.85    4.98    3.90    3.31    2.92    2.65    2.44    2.27
N:             17999    911     1500    1919    2216    2506    2755    3000    3192
MEAN FOM:      0.62     0.88    0.90    0.87    0.82    0.69    0.58    0.46    0.28
CENTRIC LOC:   -        150.6   34.5    23.8    20.7    18.4    18.9    21.9    39.4
CORR ERR:      -        1.4     1.0     0.6     0.4     0.8     0.7     0.4     0.9
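
As a consistency check on Table 9, the overall mean FOM can be recomputed as the reflection-weighted average of the per-shell values. The sketch below assumes that the eight values following the "Total" entry in each row correspond to the eight resolution shells; the variable names are hypothetical.

```python
# Shell counts and mean FOM values copied from Table 9.
n_shell = [911, 1500, 1919, 2216, 2506, 2755, 3000, 3192]
fom_shell = [0.88, 0.90, 0.87, 0.82, 0.69, 0.58, 0.46, 0.28]

n_total = sum(n_shell)
fom_overall = sum(n * f for n, f in zip(n_shell, fom_shell)) / n_total
print(f"N total = {n_total}, overall FOM = {fom_overall:.2f}")
```

Under this assumption the shell counts add up to the tabulated total, and the weighted average agrees with the tabulated overall FOM to two decimal places.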



The resulting electron density map could only marginally be improved by
density modification as implemented in the program DM (The CCP4 Suite:
"Programs for Protein Crystallography", Acta Cryst. D50: 760-763 (1991)).

The polypeptide chain was easily traced using the improved map, guided
by the 5 seleno-methionine sites. More than 90% of the amino acid sequence was
traced with relatively high confidence. The co-location of the two proposed
active-site cysteine residues (Cys82 and Cys204, Cα-Cα distance 7.6 Å) validated the
chain tracing. Further analysis of this tentative active site confirmed that a high
number of conserved residues mapped into the same site. In addition, the electron
density corresponding to the bound substrate, L-glutamate, was found between the
two conserved cysteines.

D. Refinement of the Crystal Structure

A first round of refinement was performed using the program CNS
(Crystallography & NMR System, Acta Cryst. D54: 905-921 (1998)). After
completing model building for protein atoms, an R-value of 0.24 and an R-free
value of 0.27 were achieved. At this point of the refinement, a high resolution native
data set to 1.9 Å was introduced. Additional refinement gave a final model
consisting of a polypeptide chain of 265 amino acids (residues 20-285), one
L-glutamate, one UDP-MurNAc-Ala, and 221 ordered water molecules. The final
values were R = 0.2185 and R-free = 0.2440.

Example 3: Cloning and Characterization of Murl from E. faecalis, S.
aureus, and E. faecium

The genomes of E. faecalis, S. aureus, and E. faecium contain open
reading frames (ORFs) with homology to the Staphylococcus haemolyticus Murl
gene (dga) (NCBI Accession number U12405) and to the E. coli Murl gene, which
encodes Murl activity in that organism.

To evaluate whether these ORFs encode proteins with Murl activity, the
genes were isolated by polymerase chain reaction (PCR) amplification cloning and
overexpressed in E. coli, and the proteins were purified to apparent homogeneity. A
simple assay for Murl activity, measuring the isomerization of D-glutamic acid to
L-glutamic acid, was developed to facilitate purification and for future use as a
high-throughput drug screen.

Cloning of E. faecalis and S. aureus murl gene encoding Murl:

An 822 base pair DNA sequence encoding the murl gene of E. faecalis was
isolated by polymerase chain reaction (PCR) amplification cloning. A synthetic
oligonucleotide primer (5'-AAATAGTCATATGAAAATAGGCGTTTTTG-3'
(SEQ ID NO: 67)) encoding an NdeI restriction site and the 5' terminus of the murl
gene and a primer (5'-AGAATTCTATTACAATTTGAGCCATTCT-3' (SEQ ID
NO: 68)) encoding an EcoRI restriction site and the 3' end of the murl gene were
used to amplify the murl gene of E. faecalis using genomic DNA prepared from the
ATCC 29212 type strain of E. faecalis as the template DNA for the PCR
amplification reactions (Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., F. Ausubel et al., editors, 1994).

An 801 base pair DNA sequence encoding the murl gene of S. aureus was
isolated by polymerase chain reaction (PCR) amplification cloning. A synthetic
oligonucleotide primer (5'-AAATAGTCATATGAAAATAGGCGTTTTTG-3'
(SEQ ID NO: 67)) encoding an NdeI restriction site and the 5' terminus of the murl
gene and a primer (5'-AGAATTCTATTACAATTTGAGCCATTCT-3' (SEQ ID
NO: 68)) encoding an EcoRI restriction site and the 3' end of the murl gene were
used to amplify the murl gene of S. aureus using genomic DNA prepared from the
ATCC 25923 type strain of S. aureus as the template DNA for the PCR
amplification reactions (Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., F. Ausubel et al., editors, 1994).

A 913 base pair DNA sequence encoding the murl gene of E. faecium was
isolated by polymerase chain reaction (PCR) amplification cloning. A synthetic
oligonucleotide primer (5'-AAATAGTCATATGAAAATAGGCGTTTTTG-3'
(SEQ ID NO: 67)) encoding an NdeI restriction site and the 5' terminus of the murl
gene and a primer (5'-AGAATTCTATTACAATTTGAGCCATTCT-3' (SEQ ID
NO: 68)) encoding an EcoRI restriction site and the 3' end of the murl gene were
used to amplify the murl gene of E. faecium using genomic DNA prepared from the
ATCC 19434 type strain of E. faecium as the template DNA for the PCR
amplification reactions (Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., F. Ausubel et al., editors, 1994).
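
The restriction sites carried by the two primers can be located with a simple string search. The sketch below uses the well-known recognition sequences of NdeI (CATATG) and EcoRI (GAATTC); the function name `find_sites` is hypothetical and is shown only for illustration.

```python
# Primer sequences as given in the text (SEQ ID NO: 67 and SEQ ID NO: 68).
forward = "AAATAGTCATATGAAAATAGGCGTTTTTG"
reverse = "AGAATTCTATTACAATTTGAGCCATTCT"

# Recognition sequences of the two restriction enzymes.
SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC"}

def find_sites(primer):
    """Return {enzyme: 0-based position} for each recognition site present in the primer."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}

print("forward:", find_sites(forward))
print("reverse:", find_sites(reverse))
```

The NdeI site includes an ATG, which is consistent with the forward primer covering the 5' terminus of the coding sequence.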

To amplify a DNA sequence containing the murl gene, genomic DNA (25
nanograms) was introduced into each of two reaction vials containing 1.0 micromole
of each synthetic oligonucleotide primer, 2.0 mM MgCl2, 0.2 mM of each
deoxynucleotide triphosphate (dATP, dGTP, dCTP and dTTP), and 1.25 units of
heat-stable DNA polymerase (Amplitaq, Roche Molecular Systems, Inc.,
Branchburg, NJ, USA) in a final volume of 50 microliters. The following thermal
cycling conditions were used to obtain amplified DNA products for the murl gene
using a Perkin Elmer Cetus/GeneAmp PCR System 2400 thermal cycler:

Conditions for amplification of E. faecalis and S. aureus murl:
Denaturation at 94°C for 2 minutes;

30 cycles at 94°C for 10 seconds, 55°C for 30 seconds, and 72°C for 90
seconds;

Reactions were concluded at 72°C for 7 minutes.

Conditions for amplification of E. faecium murl:
Denaturation at 94°C for 5 minutes;

25 cycles at 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 60
seconds;

Reactions were concluded at 72°C for 7 minutes.
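
For planning purposes, the total programmed hold time of each cycling protocol can be estimated by summing the steps listed above. The sketch below ignores the ramp times between temperatures, so the actual run time on the thermal cycler will be somewhat longer; the function name `program_seconds` is hypothetical.

```python
def program_seconds(initial_s, cycles, steps_s, final_s):
    """Total programmed hold time in seconds (ramp times between temperatures are ignored)."""
    return initial_s + cycles * sum(steps_s) + final_s

# Hold times taken from the two programs in the text.
faecalis_aureus = program_seconds(2 * 60, 30, [10, 30, 90], 7 * 60)
faecium = program_seconds(5 * 60, 25, [30, 30, 60], 7 * 60)

print(f"E. faecalis / S. aureus: {faecalis_aureus / 60:.0f} min")
print(f"E. faecium:              {faecium / 60:.0f} min")
```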



Upon completion of thermal cycling reactions, the amplified DNA was
washed and purified using the Qiaquick Spin PCR purification kit (Qiagen,
Gaithersburg, MD, USA). The amplified DNA sample was subjected to digestion
with the restriction endonucleases NdeI and EcoRI (New England Biolabs, Beverly,
MA, USA) (Current Protocols in Molecular Biology, ibid). The DNA samples from
each of two reaction mixtures were pooled and subjected to electrophoresis on a
1.0% SeaPlaque (FMC BioProducts, Rockland, ME, USA) agarose gel. DNA was
visualized by exposure to ethidium bromide and long wave UV irradiation.
Amplified DNA encoding the H. pylori Murl gene was isolated from agarose gel
slices and purified using the Bio 101 GeneClean Kit protocol (Bio 101, Vista, CA,
USA).



Cloning of E. faecalis, S. aureus, and E. faecium murl DNA sequences into the
pET-23 prokaryotic expression vector:

The pET-23b vector can be propagated in any E. coli K-12 strain, e.g.,
HMS174, HB101, JM109, DH5α, etc., for the purpose of cloning or plasmid
preparation. Hosts for expression include E. coli strains containing a chromosomal
copy of the gene for T7 RNA polymerase. These hosts are lysogens of bacteriophage
DE3, a lambda derivative that carries the lacI gene, the lacUV5 promoter and the
gene for T7 RNA polymerase. T7 RNA polymerase is induced by addition of
isopropyl-β-D-thiogalactoside (IPTG), and the T7 RNA polymerase transcribes
any target plasmid, such as pET-28b, carrying its gene of interest. Strains used in our
laboratory include BL21(DE3) (Studier, F.W., Rosenberg, A.H., Dunn, J.J., and
Dubendorff, J.W., Meth. Enzymol. 185: 60-89, 1990). The pET-23b vector (Novagen,
Inc., Madison, WI, USA) was prepared for cloning by digestion with NdeI and
EcoRI (Current Protocols in Molecular Biology, ibid). Following digestion, the
amplified, agarose gel-purified DNA fragment carrying the murl gene was cloned
(Current Protocols in Molecular Biology, ibid) into the previously digested pET-23b
expression vector. Products of the ligation reaction were then used to transform the
BL21(DE3) strain of E. coli.

Transformation of competent bacteria with recombinant plasmids: 

Competent bacteria, E. coli strain BL21 or strain BL21(DE3), were
transformed with recombinant pET-23-murl expression plasmid carrying the cloned
E. faecalis or S. aureus sequence according to standard methods (Current Protocols
in Molecular Biology, ibid). Briefly, 1 microliter of ligation reaction was mixed with 50
microliters of electrocompetent cells and subjected to a high voltage pulse, after
which samples were incubated in 0.45 milliliters SOC medium (0.5% yeast extract,
2.0% tryptone, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4 and 20 mM
glucose) at 37°C with shaking for 1 hour. Samples were then spread on LB agar
plates containing 100 microgram/ml ampicillin for growth overnight. Transformed
colonies of BL21 were then picked and analyzed to evaluate cloned inserts as
described below.

Identification of recombinant pET expression plasmids carrying murI sequences:

Individual BL21 clones transformed with recombinant pET-23-murI were
analyzed by PCR amplification of the cloned inserts using the same forward and
reverse primers specific for each sequence that were used in the original PCR
amplification cloning reactions. Successful amplification verified the integration of
the murI sequences in the expression vector (Current Protocols in Molecular
Biology, ibid).

Isolation and Preparation of plasmid DNA from BL21 transformants:

Colonies carrying pET-23-murI vectors were picked and incubated in 5 ml
of LB broth plus 100 microgram/ml ampicillin overnight. The following day
plasmid DNA was isolated and purified using the Qiagen plasmid purification
protocol (Qiagen Inc., Chatsworth, CA, USA).

Cloning and expression of the E. coli groE operon: 

It has been demonstrated that co-expression of the E. coli murI gene with the
genes in the E. coli groE operon reduces the formation of insoluble inclusion bodies
containing recombinant MurI (Ashiuchi, M., Yoshimura, T., Kitamura, T., Kawata,
Y., Nagai, J., Gorlatov, S., Esaki, N. and Soda, K. 1995, J. Biochem. 117: 495-498).
The groE operon encodes two proteins, GroES (97 amino acids) and GroEL (548
amino acids), which are molecular chaperones. Molecular chaperones cooperate to
assist the folding of new polypeptide chains (F. Ulrich Hartl, 1996, Nature London
381: 571-580).

The 2210 bp DNA sequence encoding the groE operon of E. coli (NCBI
Accession number X07850) was isolated by polymerase chain reaction (PCR)
amplification cloning. A synthetic oligonucleotide primer
(5'-GCGAATTCGATCAGAATTTTTTTTCT-3' (SEQ ID NO: 69)) encoding an EcoRI
restriction site and the 5' terminus of the groE operon containing the endogenous
promoter region of the groE operon and a primer
(5'-ATAAGTACTTGTGAATCTTATACTAG-3' (SEQ ID NO: 70)) encoding a ScaI
restriction site and the 3' end of the groEL gene contained in the groE operon were
used to amplify the groE operon of E. coli using genomic DNA prepared from E.
coli strain MG1655 as the template DNA for the PCR amplification reactions
(Current Protocols in Molecular Biology, ibid). To amplify a DNA sequence
containing the E. coli groE operon, genomic DNA (12.5 nanograms) was introduced
into each of two reaction vials containing 0.5 micromoles of each synthetic
oligonucleotide primer, 1.5 mM MgCl2, 0.2 mM each deoxynucleotide triphosphate
(dATP, dGTP, dCTP and dTTP) and 2.6 units heat stable DNA polymerases
(Expand High Fidelity PCR System, Boehringer Mannheim, Indianapolis,
Indiana) in a final volume of 50 microliters. The following thermal cycling
conditions were used to obtain amplified DNA products for the groE operon using a
Perkin Elmer Cetus/GeneAmp PCR System 9600 thermal cycler:

... culture at a final concentration of 1.0 mM. Cells were cultured overnight to
induce gene expression of the H. pylori recombinant DNA constructions.

After induction of gene expression with IPTG, bacteria were pelleted by
centrifugation in a Sorvall RC-3B centrifuge at 3,000 x g for 20 minutes at 4°C.
Pellets were re-suspended in 50 milliliters of cold 10 mM Tris-HCl, pH 8.0, 0.1 M
NaCl and 0.1 mM EDTA (STE buffer). Cells were then centrifuged at 2000 x g for
20 min at 4°C. Pellets were weighed (average wet weight 6 grams/liter) and
processed to purify recombinant protein as described below.

Purification procedure for recombinant Murl proteins: 

The cell pellet from a 2 L culture was suspended in 50 ml lysis buffer (50 mM
Tris pH 8, 150 mM NaCl, 1 mM DL glutamate, 0.1 mM phenylmethanesulfonyl
fluoride) and lysed by French Press (10,000-15,000 psi). The lysate was
centrifuged at 4°C (10,000 x g, 30 min) and the supernatant was loaded on a 5 mL
HiTrap chelating agarose column (Amersham, Piscataway, NJ, USA) charged with
NiSO4. The column was washed with 50 mL column buffer (100 mM Tris pH 9,
300 mM NaCl, 2 mM DL glutamate). Bound protein was eluted using a step
gradient of imidazole concentration in column buffer, and pure (>95%) MurI protein
eluted at 100-300 mM imidazole. Fractions containing MurI were brought to 1 mM
dithio-DL-threitol (DTT), concentrated and dialyzed (4x, 2-4 L each) against dialysis
buffer (10 mM Tris pH 8, 0.1 mM ethylene glycol-bis(2-aminoethylether)-
N,N,N',N'-tetraacetic acid (EGTA), 1 mM DL glutamate, 150 mM NaCl, 1 mM
tris(2-carboxyethyl)phosphine (TCEP)). The final dialysis buffer contained glycerol
(10-50% wt/vol) for storage at -80°C.

Amplification of internal fragments of the murI gene from 9 Enterococcal species:

In order to obtain nucleotide sequence from the murI genes of other
Enterococcal species, it was necessary to amplify and de novo sequence DNA, as
genome sequences have not been elucidated for these species. The approach used
was to design synthetic oligonucleotide primers based on the E. faecalis murI
sequence and to rely on the likely nucleotide homology between species of the
same genus to amplify murI from taxonomically related Enterococcus species.

360 base pair fragments of the murI coding sequence from 9 species of the
genus Enterococcus were generated by polymerase chain reaction (PCR)
amplification cloning. Two synthetic oligonucleotide primers derived from the E.
faecalis murI gene (5'-AAAATGCTAGTAATCGCATGTAATACCGC-3' (SEQ ID
NO: 71) and 5'-TGGGTACAACCTAAAATCAACGTATC-3' (SEQ ID NO: 72))
were used to amplify the murI gene fragment from E. saccharolyticus (ATCC
43076), E. mundtii (ATCC 43186), E. casseliflavus (ATCC 25788), E. flavescens
(ATCC 49996), E. cecorum (ATCC 43198), E. raffinosus (ATCC 49427), E.
malodoratus (ATCC 43197), E. solitarius (ATCC 49428), and E. hirae (ATCC
48043) using genomic DNA as the template for the PCR amplification reactions
(Current Protocols in Molecular Biology, John Wiley and Sons, Inc., F. Ausubel et
al., eds., 1994). To amplify a DNA sequence containing the murI gene, genomic
DNA (50 nanograms) was introduced into a reaction vial containing 1.0 micromole
of each synthetic oligonucleotide primer, 2.0 mM MgCl2, and 25 microliters of High
Fidelity Platinum PCR Supermix (Invitrogen, Carlsbad, CA 92008, USA) to a final
volume of 50 microliters. The following thermal cycling conditions were used to
obtain amplified DNA products for the murI gene using a Perkin Elmer
Cetus/GeneAmp PCR System 2400 thermal cycler:



Conditions for amplification of Enterococcal murI fragments:
Denaturation at 94°C for 5 minutes;
30 cycles at 94°C for 30 seconds, 50°C for 30 seconds, and 72°C for 30 seconds.
Reactions were concluded at 72°C for 30 minutes.
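
As a rough check on instrument time, the programmed hold times listed above can be summed. The following Python sketch is illustrative only: the function name is hypothetical, and ramp times between temperatures (which depend on the thermal cycler) are ignored.

```python
def programmed_minutes(initial_s, cycles, step_seconds, final_s):
    """Sum the programmed hold times of a PCR run, in minutes (ramp times ignored)."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60.0


# Values taken from the conditions listed above:
# 94°C for 5 min; 30 cycles of 30 s + 30 s + 30 s; final hold at 72°C for 30 min.
total_minutes = programmed_minutes(5 * 60, 30, (30, 30, 30), 30 * 60)
print(total_minutes)
```

The same helper can be applied to other cycling programs by changing the arguments.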



Upon completion of thermal cycling reactions, the amplified DNA was
washed and purified using the Qiaquick Spin PCR purification kit (Qiagen,
Gaithersburg, MD, 20876, USA). Purified fragments were sequenced using the
BigDye Terminator Cycle Sequencing Ready Reaction kit (PE Biosystems, Foster
City, CA, 94404, USA) and reactions were run on an ABI 3100 sequencer (Applied
Biosystems, Inc., Foster City, CA 94404, USA).

Example 4: Crystallization and Characterization of MurI from E. faecalis

Cloning and purification of MurI from E. faecalis has been previously
described in U.S. Provisional Application 60/435,272.

A. Crystallization of MurI from E. faecalis

Crystals were obtained by vapor diffusion using the hanging drop method
("Protein Crystallization", Terese M. Bergfors (Editor), International University
Line, pages 7-15, 1999).

Protein had been stored in 50 microliter aliquots at -80°C at a
concentration of 10 mg/ml in a buffer containing 0.2 M ammonium acetate pH 7.4, 5
mM D,L-glutamic acid and 1 mM tris(2-carboxyethyl)phosphine hydrochloride
(TCEP).

The reservoir solution typically contained 200 mM magnesium chloride, 100
mM Tris pH 7.5, and 20-25% PEG 4000.

Drops were set up by mixing 2 microliters of protein solution with 2
microliters of reservoir solution. Plate-shaped crystals appeared overnight and
reached a typical size of 0.2 x 0.2 x 0.2 mm in a week. Se-Met substituted MurI
was produced and crystallized under conditions similar to those used for the
wild-type (wt) protein. The Se-Met substitution was verified by mass spectrometry.

Selected crystals were incubated for 5 seconds in a solution containing 100 mM Tris
pH 7.5, 200 mM ammonium sulphate, 25% PEG 4000 and 25% glycerol and then
flash frozen in a cold nitrogen stream and tested on an in-house detector.



B. Data Collection

Crystals diffracted to about 2.2 Å resolution using an in-house X-ray source
(MarResearch 345 mm image-plate detector system with X-rays generated on a Rigaku
RU300HB rotating anode operated at 50 kV and 100 mA). Multiwavelength
anomalous diffraction (MAD) data (Se-Met) were
collected at BM14 at ESRF on a crystal grown from Se-Met modified material.
Three data sets were collected at three different wavelengths (0.9786 Å, 0.8789 Å,
and 0.9184 Å). The data were processed, scaled and merged using the programs
MOSFLM, SCALA, TRUNCATE and SCALEIT (The CCP4 Suite: Programs for
Protein Crystallography, Acta Cryst. D50, 760-763; Yao Jia-xing (1981), Acta
Cryst. A37, 642-644). Statistics of the MAD data collection are shown in Table 10.

The crystal belongs to the orthorhombic space group P2₁2₁2₁ and has cell
dimensions of a = 60.29 Å, b = 82.08 Å and c = 115.57 Å, wherein α = 90°, β = 90°,
and γ = 90°. This space group is encompassed by the structural coordinates of
Figure 13.

Table 10: MAD Data Collected at ESRF BM14; Resolution 2.2 Å (CCD
detector/MOSFLM)

Dataset  Wavelength (Å)  Nmeas  Nref   %poss  Multiplicity  Rfac  Ranom
PK       0.9786          64292  14241  93.7   4.3           9.0   7.2
PI       0.8789          69592  14463  94.1   4.5           9.4   5.3
RM       0.9184          81414  15837  95.6   5.0           10.5  6.2

Rfac = Sum|<I> - Ij| / Sum|Ij|
Ranom = Sum|<I+> - <I->| / Sum(<I+> + <I->)
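
The Rfac definition given with Table 10 can be illustrated with a short Python sketch. This is a minimal example under stated assumptions: the function name and the toy intensities are hypothetical and are not data from this application; the code simply evaluates Sum|<I> - Ij| / Sum|Ij| over repeated measurements Ij of each unique reflection.

```python
def r_fac(observations):
    """Evaluate Sum|<I> - I_j| / Sum|I_j| over all reflections.

    observations maps a reflection label to the list of its repeated
    intensity measurements I_j; <I> is the mean of that list.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(mean_intensity - i) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


# Toy data (hypothetical): two reflections measured three and two times.
toy = {"h1": [100.0, 110.0, 90.0], "h2": [50.0, 50.0]}
toy_r = r_fac(toy)
print(f"{100 * toy_r:.1f}%")
```

Reflections whose repeated measurements agree exactly contribute only to the denominator, so good agreement between symmetry-related measurements lowers the statistic.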
C. Phase Determination by MAD

The selenium sites were found by the program SOLVE (Terwilliger, T.C.
and J. Berendzen, "Automated MAD and MIR structure solution", Acta Cryst. D55:
849-861 (1999)). The program found a solution based on nine sites that gave the
phasing statistics described in Table 11. The resulting map was readily interpreted
and the polypeptide could easily be traced.

Table 11: Phasing statistics using MLPHARE

Figure of Merit   <0.1    0.2    0.3    0.4    0.9    1.0
# reflections     1510   1498   1426   1421   2249   2654

Figure of Merit with Resolution

DMIN:       Total   8.31   5.61   4.49   3.85   3.42   3.11   2.87   2.68
N:          16697    937   1416   1761   2056   2300   2554   2735   2938
MEAN FOM:    0.55   0.79   0.77   0.68   0.63   0.56   0.52   0.46   0.34
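
The "Total" column of the resolution table in Table 11 can be cross-checked from the per-shell values. The Python sketch below is illustrative (the variable names are hypothetical): it sums the reflections per shell and forms the reflection-weighted mean figure of merit.

```python
# (DMIN in Å, number of reflections N, mean figure of merit) per resolution shell,
# as listed in Table 11.
shells = [
    (8.31, 937, 0.79),
    (5.61, 1416, 0.77),
    (4.49, 1761, 0.68),
    (3.85, 2056, 0.63),
    (3.42, 2300, 0.56),
    (3.11, 2554, 0.52),
    (2.87, 2735, 0.46),
    (2.68, 2938, 0.34),
]

total_reflections = sum(n for _, n, _ in shells)
# Weight each shell's mean FOM by its number of reflections.
weighted_mean_fom = sum(n * fom for _, n, fom in shells) / total_reflections

print(total_reflections, round(weighted_mean_fom, 2))
```

The weighted mean reproduces the overall value only to the precision of the rounded per-shell entries.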



More than 90% of the amino acid sequence was traced with relatively high
confidence from the first map using the program O (T.A. Jones, J.Y. Zou, S.W.
Cowan & M. Kjeldgaard, "Improved methods for building protein models in
electron density maps and the location of errors in these models", Acta Cryst. A47:
110-119 (1991)). The co-location of the two proposed binding site cysteine residues
(Cys76 and Cys185, Cα-Cα distance 7.5 Å) as well as the 9 easily identified
seleno-methionine sites validated the chain tracing.

The two monomers were built independently so that differences in the
relative position of the two main domains as well as secondary structural elements
could be readily observed.

The monomer is a two domain structure wherein each domain has an
alpha-beta type fold. The binding site residues reside in the interface between the two
domains.



Table 11 (continued): Figure of Merit   0.5    0.6    0.7    0.8
                      # reflections    1338   1373   1495   1733

The E. faecalis MurI dimer has two binding sites, located on opposing sides of the
dimer face. That this is a biologically relevant dimer was confirmed by the
conservation of the residues located in the dimer interface within the family of
Gram positive bacteria. Of interest is that one binding site was found to bind
D-glutamate, whereas the other binding site bound L-glutamate. The binding of
L-glutamate seems to induce significant conformational differences in the active
site that appear to propagate throughout the structure. The protein has amino acids
that are used to fold the polypeptide chain back onto the structure to almost form
an additional strand on one of the beta sheets.

D. Refinement of the Crystal Structure 



The initial model was refined with a simulated annealing protocol using torsion
angle dynamics as implemented in the program CNS (Crystallography & NMR
System, Acta Cryst. D54: 905-921 (1998)). After the first round of refinement, the
model had an R-value of 28.6% and an R-free value of 36.8%. Further model
building and refinement yielded a model comprising 4243 atoms corresponding to 2
x 268 amino acids, one D-glutamate, one L-glutamate, and 184 ordered water
molecules. The final refinement statistics for CNS are as follows: Final R = 0.2045
and R-free = 0.2567.



Example 5: Crystallization and Characterization of MurI from S. aureus

Cloning and purification of MurI from S. aureus has been previously
described in U.S. Provisional Application 60/435,272.

A. Crystallization of MurI from S. aureus

Crystals were obtained by vapor diffusion using the hanging drop method
("Protein Crystallization", Terese M. Bergfors (Editor), International University
Line, pages 7-15, 1999).

Protein had been stored in 50 microliter aliquots at -80°C at a
concentration of 10 mg/ml in a buffer containing 5 mM D,L-glutamic acid; about 1
mM TCEP; and about 200 mM ammonium acetate pH 7.4.

The reservoir solution typically contained 150 mM ammonium sulfate,
25% PEG 8000, and 15% glycerol.

Drops were set up by mixing 2 microliters of protein solution with 2 or 4
microliters of reservoir solution. Box-shaped crystals appeared within a couple of
days and reached a typical size of 0.3 x 0.2 x 0.2 mm³ in a week. Selected crystals
were flash frozen in a cold nitrogen stream and tested on an in-house detector.

The crystal belongs to the monoclinic space group C2 and has cell
dimensions of a = 96.43 Å, b = 88.87 Å, c = 96.56 Å, α = 90°, β = 109.00°, and γ =
90°.
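
The unit cell volume follows directly from the cell dimensions. The Python sketch below is illustrative (the function name is hypothetical) and uses the general triclinic volume formula, which reduces to a·b·c for the orthorhombic E. faecalis cell of Example 4 and to a·b·c·sin(β) for the monoclinic S. aureus cell given above.

```python
import math


def cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit cell volume in cubic angstroms from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    # General (triclinic) formula; special cases follow when angles are 90 degrees.
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


# E. faecalis MurI, P2(1)2(1)2(1) cell from Example 4.
v_faecalis = cell_volume(60.29, 82.08, 115.57, 90.0, 90.0, 90.0)
# S. aureus MurI, C2 cell given above.
v_aureus = cell_volume(96.43, 88.87, 96.56, 90.0, 109.0, 90.0)

print(round(v_faecalis), round(v_aureus))
```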

B. Data Collection on S. aureus MurI

Crystals diffracted to about 3.0 Å resolution using an in-house X-ray source
(MarResearch 345 mm image-plate detector system with X-rays generated on a Rigaku
RU300HB rotating anode operated at 50 kV and 100 mA). The S. aureus structure
was solved with molecular replacement using the E. faecalis model, so only a single
data set was collected at beam line 711 at MaxLab to 2.15 Å resolution. The data
set was processed, scaled and merged using the programs MOSFLM, SCALA,
TRUNCATE and SCALEIT ("The CCP4 Suite: Programs for Protein
Crystallography", Acta Cryst. D50, 760-763; Yao Jia-xing (1981), Acta Cryst.
A37, 642-644). The data set comprised 169158 measurements of 41453 unique
reflections giving a multiplicity of 4.0, a completeness of 98.2% and an overall
R-merge of 9.9% (34.3% in the highest resolution bin).

C. Phase Determination of S. aureus MurI by molecular replacement

The structure was solved by molecular replacement using the program
MOLREP (The CCP4 Suite: "Programs for Protein Crystallography", Acta Cryst.
D50: 760-763; J. Navaza, Acta Cryst. A50: 157-163 (1994)) and a pruned model of
a monomer from the previously solved structure of E. faecalis as a search model.
From the two peaks in the rotation function, the S. aureus MurI protein was
determined to be an asymmetric dimer. The final solution at 4 Å had an R-value of
0.44 and a correlation of 0.49. The map showed clear density for the side chains that
are different between the two species, confirming a correct solution.

The model was easily corrected using the software ONO (T.A. Jones et al.
Acta Cryst. A47: 110-119 (1991)) and a model corresponding to residues 3 to 268
could be constructed from the first map.

D. Refinement of the S. aureus Crystal Structure

The initial model was refined with a simulated annealing protocol using
torsion angle dynamics as implemented in the program CNS (Crystallography &
NMR System, Acta Cryst. D54: 905-921 (1998)). After one cycle of simulated
annealing, an R-value of 0.27 and an R-free value of 0.30 were achieved. Additional
refinement and model building gave a final model consisting of two polypeptide
chains of 262 amino acids (residues 1-262), two D-glutamates, and 133 ordered
water molecules. The final values were R = 0.2020 and R-free = 0.2290.

Example 6: Crystallization and Characterization of MurI from E. faecium

Cloning and purification of MurI from E. faecium has been previously
described in U.S. Provisional Application 60/435,272.

A. Crystallization

Crystals were obtained by vapour diffusion using the hanging drop method
(Protein Crystallization, Terese M. Bergfors (Ed.), Published 1999).

Protein was stored at -80°C at a concentration of 10 mg/ml in a buffer
containing 200 mM ammonium acetate pH 7.4, 5 mM D,L-glutamic acid, and 1 mM
TCEP.

The reservoir solution typically contained 100 mM sodium di-hydrogen
citrate pH 5.6, and (1) 0.6-0.7 M ammonium di-hydrogen phosphate, (2) 200 mM
tri-sodium citrate dihydrate and 20% polyethylene glycol 3350, or (3) 200 mM
di-sodium tartrate dihydrate and 20% polyethylene glycol 3350.

Two microliters of the protein solution were mixed with 2 or 4 microliters of
the reservoir solution to make drops. Bipyramidal crystals appeared within a day,
and reached a typical size of 0.3 x 0.3 x 0.2 mm³ within one week.

The crystals belong to the trigonal space group P3₁21 with cell
dimensions of (phosphate) a = b = 85.82 Å and c = 92.25 Å, α = 90°, β = 90° and γ
= 120°; (citrate) a = b = 85.42 Å and c = 92.91 Å, α = 90°, β = 90° and γ = 120°;
and (tartrate) a = b = 85.16 Å and c = 96.56 Å, α = 90°, β = 109° and γ = 90°.
Crystals can be flash-cooled to 95 K directly from the drop without ice formation.

B. Data Collection

Crystals were checked for diffraction using an in-house X-ray source
(MarResearch 345 mm image-plate detector system with X-rays generated on a Rigaku
RU300HB rotating anode operated at 50 kilovolts and 100 milliamps). Complete
data sets were collected at beam line 711 at MaxLab to 1.8 Å (phosphate and citrate)
and 2.0 Å (tartrate) resolution. The data sets were processed, scaled and merged
using MOSFLM, SCALA and TRUNCATE (The CCP4 Suite: "Programs for
Protein Crystallography", Acta Cryst. D50, 760-763). The data sets include:

Phosphate: 732191 measurements of 36792 unique reflections giving a
multiplicity of 19.9, a completeness of 99.4% and an overall R-merge of 8.0%;

Citrate: 151895 measurements of 34849 unique reflections giving a
multiplicity of 4.4, a completeness of 95.3% and an overall R-merge of 4.9%; and

Tartrate: 85092 measurements of 25682 unique reflections giving a
multiplicity of 3.3, a completeness of 94.4% and an overall R-merge of 6.6%.
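
The multiplicities quoted above are consistent with the ratio of measurements to unique reflections. A minimal Python sketch of that check follows; the dictionary layout and names are illustrative, and the counts are those listed for the three E. faecium data sets.

```python
# (number of measurements, number of unique reflections) per data set, from the text above.
data_sets = {
    "phosphate": (732191, 36792),
    "citrate": (151895, 34849),
    "tartrate": (85092, 25682),
}

# Multiplicity (redundancy) is the average number of measurements per unique reflection.
multiplicity = {
    name: measurements / unique
    for name, (measurements, unique) in data_sets.items()
}

for name, value in multiplicity.items():
    print(f"{name}: {value:.1f}")
```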



C. Phase determination by Molecular Replacement (MR)

The structure was solved by molecular replacement using the phosphate data
set and the program MOLREP (The CCP4 Suite: "Programs for Protein
Crystallography", Acta Cryst. D50, 760-763; J. Navaza, Acta Cryst. A50, 157-163
(1994)) and a pruned model of a monomer from the previously solved structure of E.
faecalis as a search model. From the single major peak in the rotation function it
was evident that there was only a monomer in the asymmetric unit, but with a
twofold crystallographic axis creating the same dimer as seen in the solved E.
faecalis and S. aureus structures. The final solution at 4 Å had an R-value of 0.448
and a correlation of 0.504. The map showed clear density for the side chains that are
different between E. faecium and E. faecalis, confirming a correct solution.

The incomplete model was corrected manually using the software ONO
(T.A. Jones, J.Y. Zou, S.W. Cowan & M. Kjeldgaard, "Improved methods for
building protein models in electron density maps and the location of errors in these
models", Acta Cryst. A47, 110-119 (1991)) and a model corresponding to residues
2-273 was constructed from the first map.

D. Refinement of the Crystal Structure

A first round of refinement was performed using the program CNS
(Crystallography & NMR System, Acta Cryst. (1998) D54: 905-921). After one
cycle of simulated annealing, an R-value of 0.27 and an R-free value of 0.30 were
achieved. Additional refinement and model building gave a final model consisting
of one polypeptide chain of 271 amino acid residues (residues 1-271 of SEQ ID NO:
48), two phosphate ions and 250 ordered water molecules. The final R-values were
R = 0.1927 and R-free = 0.2092.

The citrate and tartrate structures were solved by molecular replacement
using MOLREP and the phosphate model as a search model. They were refined to
1.8 Å and 2.0 Å resolution for citrate and tartrate, respectively. The citrate structure
consists of one polypeptide chain of 271 amino acids (residues 1-271 of SEQ ID
NO: 48), one citrate molecule and 282 ordered water molecules. The final R-values
were R = 0.1991 and R-free = 0.2073. The tartrate structure consists of one
polypeptide chain of 271 amino acids (residues 1-271 of SEQ ID NO: 48), one
tartrate molecule, and 296 ordered water molecules. The final R-values were R =
0.2036 and R-free = 0.244.

Example 7: Isolation and Identification of Mutant MurI Proteins

Two-fold serial dilutions of MurI inhibitors (AR-B051082, AR-B051076,
and AR-B052184), ranging from 0.25 to 32 x MIC, were prepared in agar media in
triplicate. After transfer of H. pylori cells (> 10 cfu) onto the plates, the plates were
incubated at 37°C in a tri-gas incubator (5% oxygen, 85% nitrogen, 10% carbon
dioxide) for seven (7) days. Colonies isolated from plates with compound
concentrations above MIC levels were selected and transferred to agar media
without inhibitor, prior to growth on agar media containing inhibitor to confirm
acquisition of a stable resistance phenotype.
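
For illustration, the two-fold dilution series described above can be enumerated as multiples of the MIC. The Python sketch below uses a hypothetical helper name; it only lists the concentration steps implied by a range of 0.25 to 32 x MIC and says nothing about the absolute MIC of any compound.

```python
def twofold_series(lowest, highest):
    """Return the two-fold series from lowest to highest (inclusive), in multiples of MIC."""
    series = []
    value = lowest
    while value <= highest:
        series.append(value)
        value *= 2  # each step doubles the concentration
    return series


mic_multiples = twofold_series(0.25, 32.0)
print(len(mic_multiples), mic_multiples)
```

Under these assumptions the range corresponds to eight concentration steps per compound, each plated in triplicate.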

Chromosomal DNA was extracted from representative colonies and the murI
gene was amplified using polymerase chain reaction (PCR) with template primers
TGATGCAACAAATGGACGA (SEQ ID NO: 75) and
TTACAATTTGAGCCATTC (SEQ ID NO: 76). Mutations in the murI gene were
identified using DNA sequencing. Samples of the mutant murI PCR products were
transformed into wild-type H. pylori and subsequently grown on agar media
containing MurI inhibitors at concentrations above the MIC of the wild-type strain.
Confluent growth on those plates, relative to control on media containing
inhibitor, showing the presence of mutated MurI, confirmed that resistance was
mediated by expression of mutant MurI proteins.

Mutant Protein Overexpression and Characterization: 

Cloned J99 murI was altered to encode either mutation A75T (G223A) or
E151K (G451A) using site-directed mutagenesis in the expression vector pET23a.
The mutated clones were shown by transformation to confer resistance to All 244
acrB'. For overexpression, the above constructs were co-transformed with a plasmid
encoding GroEL into BL21(DE3) and grown in LB with D/L glutamate. Induction
was done using 0.5 mM IPTG at room temperature overnight. The expression levels
of the mutated MurI proteins were the same as wild-type MurI, with about 50-80%
solubility. The expressed proteins were purified using the procedure described
for the wild-type MurI protein (>20 mg, >99% pure). Kinetics of the MurI
resistance mutant proteins were measured in the forward (L-Glu -> D-Glu) and
reverse (D-Glu -> L-Glu) directions. In the forward direction, no substrate inhibition
by L-Glu was observed and the turnover numbers were very similar to the wild-type
enzyme (kcat ~90 min⁻¹), but the KM for L-Glu was elevated by 10-fold, giving an
overall drop in catalytic efficiency (kcat/KM) of 10-fold relative to the native enzyme.
In the reverse direction (D-Glu -> L-Glu) the mutant enzyme exhibits substrate
inhibition by D-glutamate, but the inhibition constant is shifted > 120-fold relative
to the wild-type enzyme (i.e. Ki,D-Glu A75T ~ 680 μM, Ki,D-Glu wt ~ 5 μM). Further,
the KM for D-Glu for the mutant enzyme is elevated relative to the wild-type enzyme
(i.e. KM,D-Glu A75T ~ 280 μM, KM,D-Glu wt ~ 70 μM).
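
The fold changes quoted above can be reproduced from the stated constants. The Python sketch below is a hedged illustration: the rate function uses the common textbook substrate-inhibition form v = kcat·[S] / (KM + [S] + [S]²/Ki), which is an assumption and not a model stated in this example, and kcat is set to an arbitrary relative value of 1 for the reverse direction because no reverse turnover number is given.

```python
def rate(s_um, kcat, km_um, ki_um=None):
    """Michaelis-Menten rate; adds an [S]^2/Ki term when a substrate-inhibition constant is given."""
    denominator = km_um + s_um
    if ki_um is not None:
        denominator += s_um * s_um / ki_um
    return kcat * s_um / denominator


# Constants from the text (micromolar), reverse direction (D-Glu -> L-Glu).
KI_WT, KI_A75T = 5.0, 680.0
KM_WT, KM_A75T = 70.0, 280.0

ki_fold = KI_A75T / KI_WT   # shift in the substrate-inhibition constant
km_fold = KM_A75T / KM_WT   # shift in KM for D-Glu

# Forward direction: kcat unchanged (~90 per min) and KM raised 10-fold (relative units).
efficiency_drop = (90.0 / 1.0) / (90.0 / 10.0)

# Relative rates at a hypothetical 1000 micromolar D-Glu, with kcat = 1 (arbitrary units).
v_wt = rate(1000.0, 1.0, KM_WT, KI_WT)
v_a75t = rate(1000.0, 1.0, KM_A75T, KI_A75T)

print(ki_fold, km_fold, efficiency_drop)
```

Under this assumed model, the weaker substrate inhibition of the mutant means that it retains more activity than the wild-type enzyme at high D-glutamate concentrations, despite its higher KM.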

Example 8: MAPS 

MAPS has been described in detail above and was used, for example, to find
common structural units among Gram positive MurI from three bacterial species.

Abbreviations used in the MAPS example: SA = S. aureus, EF = E. faecalis, and EF2
= E. faecium.

Automatic mode of the program:
Least matching rate for the two molecules 0.33
Least second matching rate 0.90
superimposed structure will be written out
Shortest residues of one fragment 3
Maximum distance between Ca atoms of aligned residues 3.800000

Total 3 models will be used for 3d comparison
Model 1 Residues 113 Name: SA PDB file: muri_s_aureus_patent 1
Model 2 Residues 112 Name: EF PDB file: muri_e_faecalis_patent 114
Model 3 Residues 111 Name: EF2 PDB file: ef2_z.pdb 226
Maximum residue number 113

* Secondary structure assignment completed
113 Ca atoms read 6 helixes 4 strands in file muri_s_aureus_patent_z.pdb
* Secondary structure assignment completed
112 Ca atoms read 6 helixes 4 strands in file muri_e_faecalis_patent_z.pdb
1 chains in Mol2
Chain A Nr 1 10

Matching residues 111 Identical residues 51 Identity 45.9%
Match Rate(1) 98.2% r.m.s. of atoms 0.92 Mean distance 0.71
structure diversity 0.95 with 111 residues in match 1 of muri_e_

<<<Total 1 way to align the new structure to muri_e_faecalis_patent_z.pdb
muri_e_faecalis_patent_z.pdb<->muri_s_aureus_patent_z.pdb Max Align: 10 Max Match: 10
Best Topological Diversity 3.2 with 10 matched SS to muri_e_
Best Structure Diversity 0.95 with 111 matched residues to muri_e_
#Ca 111 RMS: 0.9 dist. 0.7 str.div: 1.0 top.div 3.2

* Secondary structure assignment completed
111 Ca atoms read 6 helixes 4 strands in file ef2_z.pdb
1 chains in Mol2
Chain A Nr 1 10

Matching residues 109 Identical residues 45 Identity 41.3%
Match Rate(1) 96.5% r.m.s. of atoms 0.94 Mean distance 0.76
structure diversity 1.01 with 109 residues in match 1 of ef2_z

<<<Total 1 way to align the new structure to ef2_z.pdb
ef2_z.pdb<->muri_s_aureus_patent_z.pdb Max Align: 10 Max Match: 10
Best Topological Diversity 4.5 with 10 matched SS to ef2_z
Best Structure Diversity 1.01 with 109 matched residues to ef2_z
#Ca 109 RMS: 0.9 dist. 0.8 str.div: 1.0 top.div 4.5

* Secondary structure assignment completed
112 Ca atoms read 6 helixes 4 strands in file muri_e_faecalis_patent_z.pdb
* Secondary structure assignment completed
111 Ca atoms read 6 helixes 4 strands in file ef2_z.pdb
1 chains in Mol2
Chain A Nr 1 10

Matching residues 111 Identical residues 78 Identity 70.3%
Match Rate(1) 99.1% r.m.s. of atoms 0.49 Mean distance 0.41
structure diversity 0.50 with 111 residues in match 1 of ef2_z

<<<Total 1 way to align the new structure to ef2_z.pdb
ef2_z.pdb<->muri_e_faecalis_patent_z.pdb Max Align: 10 Max Match: 10
Best Topological Diversity 2.0 with 10 matched SS to ef2_z
Best Structure Diversity 0.50 with 111 matched residues to ef2_z
#Ca 111 RMS: 0.5 dist. 0.4 str.div: 0.5 top.div 2.0

Comparisons between — S. aureus and E. faecalis

begin  end    Sequence Matching

A97    A158   ILPGARAAVKVTKNNKIGVIGTLGTIKSASYDIAIKSKAPAIEVTSLACP
A95    A156   IEPGARTAIMTTRNQNVLVLGTEGTIKSEAYRTHIKRINPHVEVHGVACP
              | |||| |   | |    | || |||||  |   ||   |  ||   |||

              KFVPIVESNQYR
              GFVPLVEQMRYS
               ||| ||   |

A160   A208   SVAKKIVAETLQALQLKGLDTLILGCTHYPLLRPVIQNVMGSHVTLIDS
A159   A207   TVISIVIHQTLKRWRNSESDTVILGCTHYPLLYKPIYDYFGGKKTVISS
               |       ||        || ||||||||||   |    |   | | |

Matching residues 111 Identical residues 51 Identity 45.9%
Match Rate(1) 99.1% r.m.s. of Ca 0.92 Mean distance 0.71

SA EF2

Comparisons between — SA & EF2

begin  end    Sequence Matching

A100   A161   ILPGTRAAVKKTQNKQVGIIGTIGTVKSQAYEKALKEKVPELTVTSLACP
A95    A156   IEPGARTAIMTTRNQNVLVLGTEGTIKSEAYRTHIKRINPHVEVHGVACP
              | || | |   | |  |   || || || ||    |   |   |   |||

              KFVSVVESNEYH
              GFVPLVEQMRYS
               ||  ||   |

A163   A177   SVAKKIVAETLAPLT
A159   A173   TVISIVIHQTLKRWR
               |       ||

A179   A210   KKIDTLILGCTHYPLLRPIIQNVMGENVQLID
A175   A206   SESDTVILGCTHYPLLYKPIYDYFGGKKTVIS
                 || ||||||||||   |    |     |

Matching residues 109 Identical residues 45 Identity 41.3%
Match Rate(1) 98.2% r.m.s. of Ca 0.94 Mean distance 0.76

EF EF2

Comparisons between — EF & EF2

begin  end    Sequence Matching

A100   A210   ILPGTRAAVKKTQNKQVGIIGTIGTVKSQAYEKALKEKVPELTVTSLACP
A97    A207   ILPGARAAVKVTKNNKIGVIGTLGTIKSASYDIAIKSKAPAIEVTSLACP
              |||| ||||| | |   | ||| || ||  |  | | | |   |||||||

              KFVSVVESNEYHSSVAKKIVAETLAPLTTKKIDTLILGCTHYPLLRPIIQ
              KFVPIVESNQYRSSVAKKIVAETLQALQLKGLDTLILGCTHYPLLRPVIQ
              |||  |||| | ||||||||||||  |  |  ||||||||||||||| ||

              NVMGENVQLID
              NVMGSHVTLID
              ||||  | |||

Matching residues 111 Identical residues 78 Identity 70.3%
Match Rate(1) 100.0% r.m.s. of Ca 0.49 Mean distance 0.41

Average fitting residues: 110.3

Matrix of fitting residues

        SA      EF      EF2
SA
EF      111.
EF2     109.    111.

Sequence identities of aligned residues

        SA      EF      EF2
SA
EF      0.4595
EF2     0.4128  0.7027

Matrix for fitting scores

        SA      EF      EF2
SA
EF      0.71
EF2     0.77    0.40

Refine all the pairwise alignment....
RMS deviation 0.809 Mean distance 0.626 with 3 structures
Search equivalent residues among all the structures ....
Cycle= 1 RMS 0.816 Mean-distance 0.628 with 110 aligned residues
Cycle= 2 RMS 0.816 Mean-distance 0.628 with 110 aligned residues



Total 3 fragment with 110 residues are 

5 equivalent residues in this set of the structures 



Fragment 1 length 63 

SA   A95   IEPGARTAIMTTRNQNVLVLGTEGTIKSEAYRTHIKRINPHVEVHGVACP
EF   A97   ILPGARAAVKVTKNNKIGVIGTLGTIKSASYDIAIKSKAPAIEVTSLACP
EF2  A100  ILPGTRAAVKKTQNKQVGIIGTIGTVKSQAYEKALKEKVPELTVTSLACP
           | || | |   | |      || || ||  |    |   |   |   |||

SA         GFVPLVEQMRYSD  A157
EF         KFVPIVESNQYRS  A159
EF2        KFVSVVESNEYHS  A162
            ||  ||   |

Fragment 2 length 15

SA   A159  TVISIVIHQTLKRWR  A173
EF   A160  SVAKKIVAETLQALQ  A174
EF2  A163  SVAKKIVAETLAPLT  A177
            |       ||

Fragment 3 length 32

SA   A175  SESDTVILGCTHYPLLYKPIYDYFGGKKTVIS  A206
EF   A176  KGLDTLILGCTHYPLLRPVIQNVMGSHVTLID  A207
EF2  A179  KKIDTLILGCTHYPLLRPIIQNVMGENVQLID  A210
              || ||||||||||   |    |     |



Total 43 residues are identical among all 3 structures 
Rate of overall identity 0.391 
Statistics for residues which share least identity 
SA   52  0.473
EF   85  0.773
EF2  80  0.727
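The three-structure statistics can be recomputed from the three aligned fragments. The sketch below is illustrative and rests on two assumptions: the fragment strings are the SA, EF and EF2 residues listed above with layout spaces removed, and the per-structure statistic is interpreted as the number of positions at which a structure shares its residue with at least one of the other two. That interpretation is not stated in the output; it is adopted here only because it matches the reported counts. All names in the code are hypothetical.

```python
# Fragments 1 (63 residues), 2 (15) and 3 (32), concatenated per structure.
ALIGNED = {
    "SA": "IEPGARTAIMTTRNQNVLVLGTEGTIKSEAYRTHIKRINPHVEVHGVACP"
          "GFVPLVEQMRYSD"
          "TVISIVIHQTLKRWR"
          "SESDTVILGCTHYPLLYKPIYDYFGGKKTVIS",
    "EF": "ILPGARAAVKVTKNNKIGVIGTLGTIKSASYDIAIKSKAPAIEVTSLACP"
          "KFVPIVESNQYRS"
          "SVAKKIVAETLQALQ"
          "KGLDTLILGCTHYPLLRPVIQNVMGSHVTLID",
    "EF2": "ILPGTRAAVKKTQNKQVGIIGTIGTVKSQAYEKALKEKVPELTVTSLACP"
           "KFVSVVESNEYHS"
           "SVAKKIVAETLAPLT"
           "KKIDTLILGCTHYPLLRPIIQNVMGENVQLID",
}

names = list(ALIGNED)
length = len(ALIGNED["SA"])
assert all(len(seq) == length for seq in ALIGNED.values())

# Columns where all three structures carry the same residue.
identical_all = sum(
    1 for column in zip(*ALIGNED.values()) if len(set(column)) == 1
)
overall_rate = identical_all / length

# Positions where a structure agrees with at least one other structure.
shared = {
    name: sum(
        1
        for i in range(length)
        if any(
            ALIGNED[name][i] == ALIGNED[other][i]
            for other in names
            if other != name
        )
    )
    for name in names
}

print(length, identical_all, f"{overall_rate:.3f}")
for name in names:
    print(name, shared[name], f"{shared[name] / length:.3f}")
```

With these inputs the sketch gives 110 aligned positions and 43 residues identical in all three structures, and per-structure counts of 52, 85 and 80, in agreement with the figures reported above.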

Generate the superposed models based on the multiple alignment 
EF2 is not changed 
File: muri_e_fecalis_patent_z.pdb_maps for EF
File: ef2_z.pdb_maps for EF2

SEQUENCE LISTING

<110> Anderson et al.

<120> CRYSTAL STRUCTURE OF GLUTAMATE RACEMASE (MURI) 

<130> ASZD-P01-007 

<140> Not Assigned 
<141> Filed Herewith 

<150> 60/435,272 
<151> 2002-12-20 

<150> 60/435,167 
<151> 2002-12-20 

<150> 60/435,087 
<151> 2002-12-20 

<150> 60/435,527 
<151> 2002-12-20 

<160> 76 

<170> PatentIn version 3.1

<210> 1 

<211> 768 

<212> DNA 

<213> H. pylori 

<221> CDS 

<222> (1)..(768)

<400> 1

atg aaa ata ggc gtt ttt gat agc ggt gtg ggg ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg cga ttg ttt gat gaa atc atc tac tat ggc
96

Lys Ser Leu Leu Lys Ala Arg Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cat gag att gaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Glu Ile Glu
50 55 60

tta ttg att gtg gca tgc aac acc gcg agc gct ctg gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag tat tct aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys Tyr Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aag cgg caa gtg gaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Arg Gln Val Glu Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tcc aac gcc tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aac att tcg cat tta gct act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cat tat tat ttc act ccc tta gag att tta ccc gaa gtg
528

Thr Cys Met His Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

atc att tta ggt tgc acg cat ttt ccc tta atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttc atg ggg cat ttt gcc ctt cca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Gly His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gta gaa tat ttg caa caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aaa aac aat gca tgc aca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Asn Asn Ala Cys Thr Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aga caa gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Arg Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255
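The relationship between the 768-nucleotide coding sequence of SEQ ID NO: 1 and the 255-residue protein of SEQ ID NO: 2 can be illustrated by translating codons with the standard genetic code. The following is a minimal sketch rather than the tool used to prepare the listing: it translates only the first and last rows of SEQ ID NO: 1, it assumes that any "e" appearing in a scanned triplet stands for "c" (for example "age" read as agc), and the names `CODON_TABLE` and `translate` are hypothetical.

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence to one-letter residues ('*' marks a stop)."""
    seq = "".join(dna.lower().split())
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First and last rows of SEQ ID NO: 1 (nucleotides 1-48 and 721-768).
first_row = "atg aaa ata ggc gtt ttt gat agc ggt gtg ggg ggg ttt agc gtt tta"
last_row = "gat gtg atc tgg cta gaa aga caa gct aaa gaa tgg ctc aaa ttg taa"

print(translate(first_row))
print(translate(last_row))

# 768 nucleotides give 256 codons: 255 residues plus one stop codon.
residues = 768 // 3 - 1
print(residues)
```

The final codon, taa, is a stop codon, which is why a coding sequence of 768 nucleotides corresponds to a protein of 255 residues.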

<210> 2 
<211> 255 
<212> PRT
<213> H. pylori

<400> 2

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Arg Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Glu Ile Glu
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys Tyr Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Arg Gln Val Glu Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met His Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Gly His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Asn Asn Ala Cys Thr Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Arg Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 3 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS 

<222> (1)..(768)

<400> 3

atg aaa ata ggc gtt ttt gat agc ggt gtg ggg ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa ttg ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc act acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa cca cac cag att gaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Glu
50 55 60

tta ttg att gtg gca tgc aac acc gca agc gct ctg gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aag caa caa gtg aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tcc aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg tta gag
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

att att tta ggt tgc acg cat ttt ccc ttg att gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttc atg gag cat ttt gcc ctt cca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gta gaa tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aaa aac aat gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Asn Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aga caa gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Arg Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 4 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 4 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Glu
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Asn Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Arg Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 5 
<211> 768 
<212> DNA 
<213> H. pylori 



<221> CDS 

<222> (1)..(768)

<400> 5

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa tta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc act acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att gaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Glu
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aag cga caa gta aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Arg Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg atc caa tcc aac gct tat gac aat gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta ccc gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

att att tta ggt tgc acg cat ttt ccc tta atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tca aca ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gta gga tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Gly Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aaa aaa aat gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa caa gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 6 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 6

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Glu
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Arg Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Gly Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 7 
<211> 749 
<212> DNA 
<213> H. pylori 

<221> CDS 

<222> (1)..(747)

<400> 7

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa ttg ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att aaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aag caa caa gta aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg atc caa tcc aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta ggg ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Gly Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

att att tta ggt tgc acg cat ttt ccc ttg atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gtg gaa tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aat gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa cag gct aa
749

Asp Val Ile Trp Leu Glu Lys Gln Ala
245

<210> 8 
<211> 249 
<212> PRT 
<213> H. pylori 

<400> 8 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Gly Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala
245

<210> 9 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS 

<222> (1)..(768)

<400> 9

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa cta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agt gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att gga
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gaa cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aag caa caa gta aaa gat aaa aac gcc tct att ttg
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Ser Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg atc caa tcc aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg cta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccg tta gag atc ttg cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

gtt att tta ggt tgc acg cat ttt ccc tta atc gct caa aaa att gag
576

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gtg gaa tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aaa aaa aat gca cac gca ttc cct aaa gtg gaa ttt cat gcg agt ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 10 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 10 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Ser Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255



<210> 11 
<211> 749 
<212> DNA
<213> H. pylori

<221> CDS

<222> (1)..(747)

<400> 11

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa att ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att aaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aag caa caa gta aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tct aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

cta aaa caa caa ggc tat ttg aac att tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg tta gag
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

atc att tta ggt tgc acg cat ttt ccc ttg atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt cca acc ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gta gaa tat ttg cag caa aaa tac acc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Thr Leu
210 215 220

aag aaa aat gca cac gca ttc cct aaa gtg gaa ttt cat gcg agt ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg gtt tgg cta gaa aaa cag gct aa
749

Asp Val Val Trp Leu Glu Lys Gln Ala
245

<210> 12 
<211> 249 
<212> PRT 
<213> H. pylori 

<400> 12 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Thr Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Val Trp Leu Glu Lys Gln Ala
245

<210> 13 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS 

<222> (1)..(768)

<400> 13

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa att ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agt gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att gga
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gtg aaa gat aaa aac gct cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tct aac gct tac gat aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa aat att tta gag ggc gaa ttg cta gaa
480

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

atc att tta ggt tgc acg cat ttt ccc ttg atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tta acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Leu Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gat gct att gta gaa tat ttg caa caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aat gca cac tca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ser Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 14 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 14 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Leu Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ser Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 15 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS 

<222> (1)..(768)

<400> 15

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa att ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att gaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Glu
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gaa cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gtg aaa gat aaa aac gct cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tct aac gct tac gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg cta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta ccc aaa gta
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Lys Val
165 170 175

atc att tta ggt tgc acg cat ttt ccc ttg atc gct cac caa att aag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Lys
180 185 190

ggc tat ttt atg ggg cat ttt gcc ctt tca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Gly His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gtg gga tat ttg caa caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Gly Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aat gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 16
<211> 255
<212> PRT
<213> H. pylori

<400> 16

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Glu
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Lys Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Lys
180 185 190

Gly Tyr Phe Met Gly His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Gly Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 17 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS

<222> (1)..(768)

<400> 17

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa tta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att aaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

tta ttg att gtg gca tgc aac aca gcg agt gct ctg gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa cag gta aaa gat aaa aac gcc ccc att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggc aca aaa gcg acg att caa tct aac gct tac gat aac gct
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa cga caa ggc tat ttg aac gtt tcg cat tta gcc act tcc ctt
432

Leu Lys Arg Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

atc att tta ggt tgt acg cat ttt ccc ttg atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gaa cat ttt gcc ttt cca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Phe Pro Thr Pro Pro Leu Leu Ile
195 200 205

cat tcg ggc gat gct att gtg gaa tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aat gca cac gca tta cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Leu Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa caa gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 18 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 18 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Arg Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Phe Pro Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Leu Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 19 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS

<222> (1)..(768)

<400> 19

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa tta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att aaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gta aag gat aaa aac gcc ccc att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tct aac gct tac gat aac gct
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aac gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa aat att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta gag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

atc att tta ggt tgc acg cat ttt ccc tta atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttc atg ggg cat ttt gcc ctt cca acg ccc ccc ata ctc atc
624

Gly Tyr Phe Met Gly His Phe Ala Leu Pro Thr Pro Pro Ile Leu Ile
195 200 205

cat tct ggc gac gct att gta gaa tat ttg caa caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aat gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat atg atc tgg cta gaa aaa caa gct aaa gaa tgg ctc aaa ttg taa
768

Asp Met Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 20 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 20 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Gly His Phe Ala Leu Pro Thr Pro Pro Ile Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Met Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 21 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS

<222> (1)..(768)

<400> 21

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa tta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att aaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

tta ttg att gta gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gta aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tct aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act ccc tta aag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

att att tta ggt tgc acg cat ttt ccc ttg atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

agc tat ttt atg ggg cat ttt gcc ctt cca acg ccc ccc cta ctc atc
624

Ser Tyr Phe Met Gly His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gat gct att gtg gaa tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aac gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa caa gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255



<210> 22
<211> 255
<212> PRT
<213> H. pylori

<400> 22

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Lys Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Ser Tyr Phe Met Gly His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 23 
<211> 768 
<212> DNA 
<213> H. pylori 



<221> CDS

<222> (1)..(768)

<400> 23

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa cta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att gga
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct ctg gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aaa tat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys Tyr Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gta aaa gat aaa aac gcc ccc att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg atc caa tct aac gct tat gat aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aac att tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg ccc ttg att gaa gaa agt att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act cca tta gag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

atc att tta ggt tgc acg cat ttt ccc ttg atc gct caa aaa att gag
576

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

agc tat ttt atg gag cat ttt gcc ctt tca acg ccc ccc tta ctc atc
624

Ser Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gat gct att gtg gaa tac ttg caa caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aac gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg atc tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 24 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 24 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys Tyr Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Ile Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Ser Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 25 
<211> 768 
<212> DNA 
<213> H. pylori 



<221> CDS

<222> (1)..(768)

<400> 25

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa tta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac aaa att gaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Lys Ile Glu
50 55 60

tta tta att gtg gca tgc aac aca gcg agc gct ctg gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc ccc att gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gtg aaa gat aaa aac acc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Thr Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg atc caa tct aac gct tac gat aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aag gtt tcg cat ttg gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Lys Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act cca tta gaa atc tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

gtt att tta ggc tgc acg cat ttt ccc ttg atc gct caa aaa att gag
576

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gaa cat ttt gcc ctt cca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gac gct att gtg gga tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Gly Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aac gca cac gca ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gta att tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 26 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 26 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Lys Ile Glu
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Thr Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Lys Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Pro Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Gly Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala His Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 27 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS

<222> (1)..(768)

<400> 27

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa att ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct agg gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac aag att gaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Lys Ile Glu
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gaa
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggc gtg att gaa cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gta aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tct aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa aat att tta gag ggc gaa ttg cta gaa
480

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act cca tta gag atc ttg cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

gtt att tta ggc tgc acg cat ttt ccc ttg atc gct cac caa att gag
576

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gat gct att gtg gaa tat ttg cag caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aac gca tgt gca ttc cct aaa gta gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gta att tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 28 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 28 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Lys Ile Glu
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 29 
<211> 768 
<212> DNA 
<213> H. pylori 

<221> CDS

<222> (1)..(768)

<400> 29

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gtg caa tta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Val Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agt gct agg gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac aag att gaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Lys Ile Glu
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gga gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Gly Glu
65 70 75 80

atg caa aag tat tcc aaa atc cct att gtg ggc gtg att gag cca agc
288

Met Gln Lys Tyr Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gta aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gta cta ggg aca aaa gcg acg att cga tcc aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Arg Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat att tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa aat att tta gag ggc gaa ttg cta gaa
480

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act cca tta gag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

gtt att tta ggt tgc acg cat ttt ccc ttg atc gct cac caa att gag
576

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gat gct att gtg gaa tat ttg caa caa aaa tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

aag aaa aac gca tgc gca ttc cct aaa gta gaa ttc cat gcg agc ggc
720

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gta att tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 30
<211> 255
<212> PRT
<213> H. pylori

<400> 30

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Val Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Lys Ile Glu
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Gly Glu
65 70 75 80

Met Gln Lys Tyr Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Arg Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Ile Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Ala Leu
210 215 220

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Ile Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 31
<211> 768
<212> DNA 
<213> H. pylori 

<221> CDS

<222> (1)..(768)

<400> 31

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa att ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agt gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att gga
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

tta ttg att gtg gca tgc aac aca gcg agc gct cta gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct att gtg ggt gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa caa caa gta aaa gat aaa aac gcc cct att tta
336

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg tta ggg aca aaa gcg acg att caa tcc aac gct tat gac aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aac gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa aat att tta gag ggc gaa ttg tta gaa
480

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act cca tta gag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

gtt att tta ggt tgc acg cat ttt ccc ttg atc gct cac caa att gag
576

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tca acg ccc ccc tta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gat gct att gtg gaa tat ttg caa caa aaa tac acc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Thr Leu
210 215 220

aag aaa aat gca tgc gcg ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg gtt tgg cta gaa aaa cag gct aaa gaa tgg ctc aaa ttg taa
768

Asp Val Val Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 32 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 32 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Ile Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Gly
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Ile Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Gln Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Asn Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala His Gln Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Lys Tyr Thr Leu
210 215 220

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Val Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 33 
<211> 765 
<212> DNA 
<213> H. pylori 

<221> CDS 

<222> (1) . . (765) 

<400> 33 

atg aaa ata ggc gtt ttt gat agc ggt gtg gga ggg ttt agc gtt tta
48

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

aaa agc ctt tta aaa gcg caa cta ttt gat gaa atc atc tat tat ggc
96

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

gat agc gct aga gtg cct tat ggc act aaa gac ccc acc acg atc aag
144

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

caa ttt ggc tta gag gct ttg gat ttt ttc aaa ccg cac cag att aaa
192

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

tta ttg att gtg gca tgc aac acc gca agc gct ctg gct tta gaa gag
240

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

atg caa aag cat tcc aaa atc cct gtt gtg ggc gtg att gag cca agc
288

Met Gln Lys His Ser Lys Ile Pro Val Val Gly Val Ile Glu Pro Ser
85 90 95

att tta gcg atc aaa cgg caa gtg aaa gat aaa aac gcc cct att ttg
336

Ile Leu Ala Ile Lys Arg Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

gtg cta ggg aca aaa gcg acg att caa tcc aac gcc tat gat aac gcc
384

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

ctg aaa caa caa ggc tat ttg aat gtt tcg cat tta gcc act tct ctt
432

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

ttt gtg cct ttg att gaa gaa agt att tta gag ggc gaa ttg cta gaa
480

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

act tgc atg cgt tat tat ttc act cca tta gag att tta cct gaa gtg
528

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

gtt att tta ggt tgc acg cat ttt ccc ttg atc gct caa aaa att gag
576

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

ggc tat ttt atg gag cat ttt gcc ctt tca acg ccc ccc cta ctc atc
624

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

cat tct ggc gat gct att gtg gaa tat ttg caa caa aat tac gcc ctt
672

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Asn Tyr Ala Leu
210 215 220

aag aaa aac gca tgc gcg ttc cct aaa gtg gaa ttt cat gcg agc ggc
720

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

gat gtg gtt tgg cta gaa aaa caa gct aaa gaa tgg ctt aaa ttg
765

Asp Val Val Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 34 
<211> 255 
<212> PRT 
<213> H. pylori 

<400> 34 

Met Lys Ile Gly Val Phe Asp Ser Gly Val Gly Gly Phe Ser Val Leu
1 5 10 15

Lys Ser Leu Leu Lys Ala Gln Leu Phe Asp Glu Ile Ile Tyr Tyr Gly
20 25 30

Asp Ser Ala Arg Val Pro Tyr Gly Thr Lys Asp Pro Thr Thr Ile Lys
35 40 45

Gln Phe Gly Leu Glu Ala Leu Asp Phe Phe Lys Pro His Gln Ile Lys
50 55 60

Leu Leu Ile Val Ala Cys Asn Thr Ala Ser Ala Leu Ala Leu Glu Glu
65 70 75 80

Met Gln Lys His Ser Lys Ile Pro Val Val Gly Val Ile Glu Pro Ser
85 90 95

Ile Leu Ala Ile Lys Arg Gln Val Lys Asp Lys Asn Ala Pro Ile Leu
100 105 110

Val Leu Gly Thr Lys Ala Thr Ile Gln Ser Asn Ala Tyr Asp Asn Ala
115 120 125

Leu Lys Gln Gln Gly Tyr Leu Asn Val Ser His Leu Ala Thr Ser Leu
130 135 140

Phe Val Pro Leu Ile Glu Glu Ser Ile Leu Glu Gly Glu Leu Leu Glu
145 150 155 160

Thr Cys Met Arg Tyr Tyr Phe Thr Pro Leu Glu Ile Leu Pro Glu Val
165 170 175

Val Ile Leu Gly Cys Thr His Phe Pro Leu Ile Ala Gln Lys Ile Glu
180 185 190

Gly Tyr Phe Met Glu His Phe Ala Leu Ser Thr Pro Pro Leu Leu Ile
195 200 205

His Ser Gly Asp Ala Ile Val Glu Tyr Leu Gln Gln Asn Tyr Ala Leu
210 215 220

Lys Lys Asn Ala Cys Ala Phe Pro Lys Val Glu Phe His Ala Ser Gly
225 230 235 240

Asp Val Val Trp Leu Glu Lys Gln Ala Lys Glu Trp Leu Lys Leu
245 250 255

<210> 35
<211> 29
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 35

aaatagtcat atgaaaatag gcgtttttg
29

<210> 36
<211> 28
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 36

agaattctat tacaatttga gccattct
28

<210> 37
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 37

gcgaattcga tcagaatttt ttttct
26

<210> 38
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 38

ataagtactt gtgaatctta tactag
26

<210> 39
<211> 858
<212> DNA
<213> E. coli

<400> 39 

atggctacca aactgcagga cgggaataca ccttgtctgg cagctacacc ttctgaacca
60

cgtcccaccg tgctggtgtt tgactccggc gtcggtgggt tgtcggtcta tgacgagatc
120

cggcatctct taccggatct ccattacatt tatgctttcg ataacgtcgc tttcccgtat
180

ggcgaaaaaa gcgaagcgtt tattgttgag cgagtggtgg caattgtcac cgcggtgcaa
240

gaacgttatc cccttgcgct ggctgtggtc gcttgcaaca ctgccagtac cgtttcactt
300

cctgcattac gcgaaaagtt cgacttcccg gttgttggtg tcgtgccggc gattaaacct
360

gctgcacgtc tgacggcaaa tggcattgtc ggattactgg caacccgcgg aacagttaaa
420

cgttcttata ctcatgagct gatcgcgcgt ttcgctaatg aatgccagat agaaatgctg
480

ggctcggcag agatggttga gttggctgaa gcgaagctac atggcgaaga tgtttctctg
540

gatgcactaa aacgtatcct acgcccgtgg ttaagaatga aagagccgcc agataccgtt
600

gtattgggtt gcacccattt ccctctacta caagaagaac tgttacaagt gctgccagag
660

ggaacccggc tggtggattc tggcgcagcg attgctcgcc gaacggcctg gttgttagaa
720

catgaagccc cggatgcaaa atctgccgat gcgaatattg ccttttgtat ggcaatgacg
780

ccaggagctg aacaattatt gcccgtttta cagcgttacg gcttcgaaac gctcgaaaaa
840

ctggcagttt taggctga
858

<210> 40 
<211> 285 
<212> PRT 
<213> E. coli 

<400> 40 

Met Ala Thr Lys Leu Gln Asp Gly Asn Thr Pro Cys Leu Ala Ala Thr
1 5 10 15

Pro Ser Glu Pro Arg Pro Thr Val Leu Val Phe Asp Ser Gly Val Gly
20 25 30

Gly Leu Ser Val Tyr Asp Glu Ile Arg His Leu Leu Pro Asp Leu His
35 40 45

Tyr Ile Tyr Ala Phe Asp Asn Val Ala Phe Pro Tyr Gly Glu Lys Ser
50 55 60

Glu Ala Phe Ile Val Glu Arg Val Val Ala Ile Val Thr Ala Val Gln
65 70 75 80

Glu Arg Tyr Pro Leu Ala Leu Ala Val Val Ala Cys Asn Thr Ala Ser
85 90 95

Thr Val Ser Leu Pro Ala Leu Arg Glu Lys Phe Asp Phe Pro Val Val
100 105 110

Gly Val Val Pro Ala Ile Lys Pro Ala Ala Arg Leu Thr Ala Asn Gly
115 120 125

Ile Val Gly Leu Leu Ala Thr Arg Gly Thr Val Lys Arg Ser Tyr Thr
130 135 140

His Glu Leu Ile Ala Arg Phe Ala Asn Glu Cys Gln Ile Glu Met Leu
145 150 155 160

Gly Ser Ala Glu Met Val Glu Leu Ala Glu Ala Lys Leu His Gly Glu
165 170 175

Asp Val Ser Leu Asp Ala Leu Lys Arg Ile Leu Arg Pro Trp Leu Arg
180 185 190

Met Lys Glu Pro Pro Asp Thr Val Val Leu Gly Cys Thr His Phe Pro
195 200 205

Leu Leu Gln Glu Glu Leu Leu Gln Val Leu Pro Glu Gly Thr Arg Leu
210 215 220

Val Asp Ser Gly Ala Ala Ile Ala Arg Arg Thr Ala Trp Leu Leu Glu
225 230 235 240

His Glu Ala Pro Asp Ala Lys Ser Ala Asp Ala Asn Ile Ala Phe Cys
245 250 255

Met Ala Met Thr Pro Gly Ala Glu Gln Leu Leu Pro Val Leu Gln Arg
260 265 270

Tyr Gly Phe Glu Thr Leu Glu Lys Leu Ala Val Leu Gly
275 280 285

<210> 41
<211> 29
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 41

aaatagtcat atgaaaatag gcgtttttg
29

<210> 42
<211> 28
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 42

agaattctat tacaatttga gccattct
28

<210> 43
<211> 822
<212> DNA
<213> E. faecalis

<400> 43

atgagcaatc aagaagccat tggattaatt gattctggcg ttggtggatt aactgtttta
60

aaggaagcgc taaagcaatt accaaatgaa cgattaattt atttaggaga tacagcccgt
120

tgcccatatg gtccacgacc agccgaacaa gtcgttcagt ttacttggga aatggccgat
180

tttttattga aaaaacgaat aaaaatgcta gtaatcgcat gtaataccgc gacggctgtc
240

gcattagaag aaattaaagc tgccttgcca attccagttg ttggtgttat tttacctggc
300

gcacgagcag ccgttaaagt cacaaaaaat aacaaaattg gtgtcatagg taccttaggg
360

acaatcaaaa gtgcttccta tgaaatcgcc attaaaagta aggcaccagc aattgaggtg
420

actagtttag cttgccctaa atttgtcccc attgttgaaa gtaatcaata tcgttcttcc
480

gtagcaaaaa aaattgtggc agaaacactt caagcactac aattaaaagg acttgatacg
540

ttgattttag gttgtaccca ttacccgttg ttacgtccgg tgattcaaaa tgtgatgggg
600

agtcatgtga cattaattga ctcaggagcc gaaacagttg gcgaagtcag catgcttctc
660

gattattttg acattgccca cacgcctgaa gcgcctacac agccccatga attttataca
720

actggttctg caaaaatgtt tgaagagatt gcaagcagtt ggcttggtat agagaactta
780

aaagcacaac agattcactt aggaggaaac gaaaatgatt ag
822

<210> 44
<211> 273
<212> PRT
<213> E. faecalis

<400> 44

Met Ser Asn Gln Glu Ala Ile Gly Leu Ile Asp Ser Gly Val Gly Gly
1 5 10 15

Leu Thr Val Leu Lys Glu Ala Leu Lys Gln Leu Pro Asn Glu Arg Leu
20 25 30

Ile Tyr Leu Gly Asp Thr Ala Arg Cys Pro Tyr Gly Pro Arg Pro Ala
35 40 45

Glu Gln Val Val Gln Phe Thr Trp Glu Met Ala Asp Phe Leu Leu Lys
50 55 60

Lys Arg Ile Lys Met Leu Val Ile Ala Cys Asn Thr Ala Thr Ala Val
65 70 75 80

Ala Leu Glu Glu Ile Lys Ala Ala Leu Pro Ile Pro Val Val Gly Val
85 90 95

Ile Leu Pro Gly Ala Arg Ala Ala Val Lys Val Thr Lys Asn Asn Lys
100 105 110

Ile Gly Val Ile Gly Thr Leu Gly Thr Ile Lys Ser Ala Ser Tyr Glu
115 120 125

Ile Ala Ile Lys Ser Lys Ala Pro Ala Ile Glu Val Thr Ser Leu Ala
130 135 140

Cys Pro Lys Phe Val Pro Ile Val Glu Ser Asn Gln Tyr Arg Ser Ser
145 150 155 160

Val Ala Lys Lys Ile Val Ala Glu Thr Leu Gln Ala Leu Gln Leu Lys
165 170 175

Gly Leu Asp Thr Leu Ile Leu Gly Cys Thr His Tyr Pro Leu Leu Arg
180 185 190

Pro Val Ile Gln Asn Val Met Gly Ser His Val Thr Leu Ile Asp Ser
195 200 205

Gly Ala Glu Thr Val Gly Glu Val Ser Met Leu Leu Asp Tyr Phe Asp
210 215 220

Ile Ala His Thr Pro Glu Ala Pro Thr Gln Pro His Glu Phe Tyr Thr
225 230 235 240

Thr Gly Ser Ala Lys Met Phe Glu Glu Ile Ala Ser Ser Trp Leu Gly
245 250 255

Ile Glu Asn Leu Lys Ala Gln Gln Ile His Leu Gly Gly Asn Glu Asn
260 265 270

Asp

<210> 45
<211> 801
<212> DNA
<213> S. aureus

<400> 45

atgaataaac caataggtgt aatagactct ggtgtcggag gtttgacagt agctaaagaa
60

attatgcgtc agttgccaaa tgagacgatt tattacttag gtgatattgg gcgatgtcca
120

tatgggccaa gaccaggaga acaagtaaaa caatatacag ttgaaatcgc tcgtaaatta
180

atggaatttg atataaaaat gctcgtgatt gcttgtaata ctgcaactgc tgtagcttta
240

gaatatttac aaaagacctt atcaatctca gtgattggcg taattgaacc aggtgctaga
300

acagcaataa tgacgactag aaatcaaaat gtattagtac taggaacgga aggcacaatt
360

aaatctgaag catatcgaac acatattaaa cgtataaatc cacatgtaga ggtacatggc
420

gttgcctgtc caggttttgt gccacttgta gaacaaatga gatatagtga tccaacaatt
480

acaagcattg ttattcatca aacactgaaa cgttggcgta atagtgagtc tgatactgtc
540

attttaggat gtacccacta tccattgctc tataaaccta tctatgatta ttttggtggt
600

aaaaagacag tgatttcgtc tggattagaa acggctcgtg aagttagtgc attgctaaca
660

tttagtaatg aacatgcaag ttatactgaa catccagatc atcgattttt tgcaacaggt
720

gataccacac atattactaa cattatcaaa gaatggctaa atttatctgt caatgtggaa
780

cgtatatcag tgaatgacta g
801

<210> 46
<211> 266
<212> PRT
<213> S. aureus

<400> 46

Met Asn Lys Pro Ile Gly Val Ile Asp Ser Gly Val Gly Gly Leu Thr
1 5 10 15

Val Ala Lys Glu Ile Met Arg Gln Leu Pro Asn Glu Thr Ile Tyr Tyr
20 25 30

Leu Gly Asp Ile Gly Arg Cys Pro Tyr Gly Pro Arg Pro Gly Glu Gln
35 40 45

Val Lys Gln Tyr Thr Val Glu Ile Ala Arg Lys Leu Met Glu Phe Asp
50 55 60

Ile Lys Met Leu Val Ile Ala Cys Asn Thr Ala Thr Ala Val Ala Leu
65 70 75 80

Glu Tyr Leu Gln Lys Thr Leu Ser Ile Ser Val Ile Gly Val Ile Glu
85 90 95

Pro Gly Ala Arg Thr Ala Ile Met Thr Thr Arg Asn Gln Asn Val Leu
100 105 110

Val Leu Gly Thr Glu Gly Thr Ile Lys Ser Glu Ala Tyr Arg Thr His
115 120 125

Ile Lys Arg Ile Asn Pro His Val Glu Val His Gly Val Ala Cys Pro
130 135 140

Gly Phe Val Pro Leu Val Glu Gln Met Arg Tyr Ser Asp Pro Thr Ile
145 150 155 160

Thr Ser Ile Val Ile His Gln Thr Leu Lys Arg Trp Arg Asn Ser Glu
165 170 175

Ser Asp Thr Val Ile Leu Gly Cys Thr His Tyr Pro Leu Leu Tyr Lys
180 185 190

Pro Ile Tyr Asp Tyr Phe Gly Gly Lys Lys Thr Val Ile Ser Ser Gly
195 200 205

Leu Glu Thr Ala Arg Glu Val Ser Ala Leu Leu Thr Phe Ser Asn Glu
210 215 220

His Ala Ser Tyr Thr Glu His Pro Asp His Arg Phe Phe Ala Thr Gly
225 230 235 240

Asp Thr Thr His Ile Thr Asn Ile Ile Lys Glu Trp Leu Asn Leu Ser
245 250 255

Val Asn Val Glu Arg Ile Ser Val Asn Asp
260 265

<210> 47
<211> 822
<212> DNA
<213> E. faecium

<400> 47

atgatacgat tgacagataa tcgccctatc ggatttattg attcaggtgt cggcggcttg
60

actgtagtaa aagaagccct gaaacaatta ccgaatgaaa atattttatt tgtaggagac
120

acagcacgct gcccatatgg ccctagaccc gcggaacagg taatccagta tacttgggaa
180

atgacggatt atctggtgga gcaaggaatc aagatgctgg tgatcgcctg caataccgca
240

actgcggtgg ctttagaaga aatcaaagct gctctttcta ttccagtcat cggtgtgatc
300

cttcccggta ctagagcggc agtaaaaaaa acacaaaata aacaagttgg cattatcggt
360

acgattggta cggtaaaaag tcaagcttat gaaaaagcac tgaaagagaa agtaccagaa
420

ttgactgtga caagtcttgc ttgtccaaaa tttgtttcag ttgtcgaaag taatgaatac
480

cattcatcgg tggcgaaaaa aattgtggca gaaacattag ctcctttaac cactaaaaaa
540

atcgatacat tgattttggg atgcacccat tatccattat tacgccccat cattcaaaat
600

gtaatgggag aaaatgttca actgatcgat tctggagcag aaacagtagg tgaagtatct
660

atgctgttag attatttcaa tctgagcaat tcaccgcaaa atggtcggac attatgccag
720

ttttatacaa ctggctctgc caaacttttc gaggaaatag ctgaagactg gcttggaatc
780

ggacacttaa atgtagaaca tatcgaattg ggaggaaaat aa
822

<210> 48
<211> 273
<212> PRT
<213> E. faecium

<400> 48

Met Ile Arg Leu Thr Asp Asn Arg Pro Ile Gly Phe Ile Asp Ser Gly
1 5 10 15

Val Gly Gly Leu Thr Val Val Lys Glu Ala Leu Lys Gln Leu Pro Asn
20 25 30

Glu Asn Ile Leu Phe Val Gly Asp Thr Ala Arg Cys Pro Tyr Gly Pro
35 40 45

Arg Pro Ala Glu Gln Val Ile Gln Tyr Thr Trp Glu Met Thr Asp Tyr
50 55 60

Leu Val Glu Gln Gly Ile Lys Met Leu Val Ile Ala Cys Asn Thr Ala
65 70 75 80

Thr Ala Val Ala Leu Glu Glu Ile Lys Ala Ala Leu Ser Ile Pro Val
85 90 95

Ile Gly Val Ile Leu Pro Gly Thr Arg Ala Ala Val Lys Lys Thr Gln
100 105 110

Asn Lys Gln Val Gly Ile Ile Gly Thr Ile Gly Thr Val Lys Ser Gln
115 120 125

Ala Tyr Glu Lys Ala Leu Lys Glu Lys Val Pro Glu Leu Thr Val Thr
130 135 140

Ser Leu Ala Cys Pro Lys Phe Val Ser Val Val Glu Ser Asn Glu Tyr
145 150 155 160

His Ser Ser Val Ala Lys Lys Ile Val Ala Glu Thr Leu Ala Pro Leu
165 170 175

Thr Thr Lys Lys Ile Asp Thr Leu Ile Leu Gly Cys Thr His Tyr Pro
180 185 190

Leu Leu Arg Pro Ile Ile Gln Asn Val Met Gly Glu Asn Val Gln Leu
195 200 205

Ile Asp Ser Gly Ala Glu Thr Val Gly Glu Val Ser Met Leu Leu Asp
210 215 220

Tyr Phe Asn Leu Ser Asn Ser Pro Gln Asn Gly Arg Thr Leu Cys Gln
225 230 235 240

Phe Tyr Thr Thr Gly Ser Ala Lys Leu Phe Glu Glu Ile Ala Glu Asp
245 250 255

Trp Leu Gly Ile Gly His Leu Asn Val Glu His Ile Glu Leu Gly Gly
260 265 270

Lys

<210> 49
<211> 335
<212> DNA
<213> E. saccharolyticus

<400> 49

gcatgtaata ccgcaacggc ggtagcgtta gaagaaatta aagcgcaatt agatattcca
60

gtcgtcggtg tgatcttacc tggtactcgt gctgcagtta aagctacgaa aaatcgtcaa
120

atcggtatta taggaacagc gggtacaatt aaaagtagtt cgtatgagca agcaattaaa
180

atgaaagtgc ctgaagcatc ggtgactagt ttagcttgtc ctaaatttgt accgattgtt
240

gaaagtaatc aatttcaatc atcggtagct aaaaaaattg ttgctgagac gttattacca
300

ttgcaacata aaaaattaga tacgttgatt ttagg
335

<210> 50
<211> 111
<212> PRT
<213> E. saccharolyticus

<400> 50

Ala Cys Asn Thr Ala Thr Ala Val Ala Leu Glu Glu Ile Lys Ala Gln
1 5 10 15

Leu Asp Ile Pro Val Val Gly Val Ile Leu Pro Gly Thr Arg Ala Ala
20 25 30

Val Lys Ala Thr Lys Asn Arg Gln Ile Gly Ile Ile Gly Thr Ala Gly
35 40 45

Thr Ile Lys Ser Ser Ser Tyr Glu Gln Ala Ile Lys Met Lys Val Pro
50 55 60

Glu Ala Ser Val Thr Ser Leu Ala Cys Pro Lys Phe Val Pro Ile Val
65 70 75 80

Glu Ser Asn Gln Phe Gln Ser Ser Val Ala Lys Lys Ile Val Ala Glu
85 90 95

Thr Leu Leu Pro Leu Gln His Lys Lys Leu Asp Thr Leu Ile Leu
100 105 110

<210> 51
<211> 344
<212> DNA
<213> E.

<400> 51



gtaatcgcat gtaataccgc aactgcggtc gcattagaag aaatcaaagc aacactctcg
60

attccagtga tcggtgtgat tttgccagga acgagagcgg cagtcaagca gacgaaaaat
120

catcgagtag gggtgattgg aacaattggt accgtcaaaa gtgctgctta cgagacggca
180

ttattggata aagcacccga actgaaagtt accagcttgg cgtgtccaaa gtttgtttca
240

gtcgtagaaa gtaaagaata ccgatcatca gtcgctaaaa aaatcgtggc tcaaactttg
300

cttccattag aattaaaagg gatcgatacg ttgattttag gttg
344

<210> 52
<211> 113
<212> PRT
<213> E. mundtii

<400> 52

Val Ile Ala Cys Asn Thr Ala Thr Ala Val Ala Leu Glu Glu Ile Lys
1 5 10 15

Ala Thr Leu Ser Ile Pro Val Ile Gly Val Ile Leu Pro Gly Thr Arg
20 25 30

Ala Ala Val Lys Gln Thr Lys Asn His Arg Val Gly Val Ile Gly Thr
35 40 45

Ile Gly Thr Val Lys Ser Ala Ala Tyr Glu Thr Ala Leu Leu Asp Lys
50 55 60

Ala Pro Glu Leu Lys Val Thr Ser Leu Ala Cys Pro Lys Phe Val Ser
65 70 75 80

Val Val Glu Ser Lys Glu Tyr Arg Ser Ser Val Ala Lys Lys Ile Val
85 90 95


Pro His Ala Ser Val Val Ser Leu Ala Cys Pro Lys Phe Val Pro Ile
65 70 75 80

Val Glu Ser Lys Gln Tyr His Ser Ser Val Ala Lys Lys Ile Val Ala
85 90 95

Glu Thr Leu Arg Pro Leu Lys Asn Lys Arg Leu Asp Thr Leu Ile Leu
100 105 110

<210> 55
<211> 337
<212> DNA
<213> E. flavescens

<400> 55

atcgcatgta ataccgcgac agcggtcgcc cttgaagaaa tcaaagaaca actaacgatc
60

ccagtgatcg gcgtgatcct gcctggcagt cgagcagcag tcaaagcaag caaaaaccaa
120

cgaatcggtg tcatcgggac aaacggaacg atcaaaagtg actcttacaa gcgcgcgctt
180

catggcaaag cgccccatgc gtccgtcgtc agtttggctt gcccgaagtt tgtgccgatc
240

gtagaaagca aacaatacca tagctcggtc gccaagaaaa tcgtggcaga aacgttgcgt
300

ccattgaaaa acaaacggct agatacgttg attttag
337

<210> 56
<211> 112
<212> PRT
<213> E. flavescens

<400> 56

Ile Ala Cys Asn Thr *
1 5 10 15

Gln Leu Thr Ile Pro Val Ile Gly Val Ile Leu Pro Gly Ser Arg Ala
20 25 30

Ala Val Lys Ala Ser Lys Asn Gln Arg Ile Gly Val Ile Gly Thr Asn
35 40 45

Gly Thr Ile Lys Ser Asp Ser Tyr Lys Arg Ala Leu His Gly Lys Ala
50 55 60

Pro His Ala Ser Val Val Ser Leu Ala Cys Pro Lys Phe Val Pro Ile
65 70 75 80

Val Glu Ser Lys Gln Tyr His Ser Ser Val Ala Lys Lys Ile Val Ala
85 90 95

Glu Thr Leu Arg Pro Leu Lys Asn Lys Arg Leu Asp Thr Leu Ile Leu
100 105 110

<210> 57 
<211> 341 
<212> DNA 
<213> E. cecorum 

<400> 57 

atcgcatgta ataccgcgac tgcagcagct ttaacccaaa ttaaggaaca attagacatt 
60 

ccagttgtcg gtgtgatttt acctggaact agagctgctg tcaaaaatac aaaatcgcaa 
120 

cgaattggga ttatcggcac acaaggaacc atccaaagtg gcagttatga acaagccatt 
180 

ctttctaaag taccgactgc tcaacctgtg agtttagcgt gtcctagatt tgttccgata 
240 

gtagaaagta atcaagcaaa ttcaagtgtg gcaaaaaaaa ttgtcgctca aacactacaa 
300 

ccgatgacga aaaaaaacat cgatacgttg attttaggtt g 
341 

<210> 58 
<211> 112 
<212> PRT 
<213> E. cecorum 

<400> 58 

Ile Ala Cys Asn Thr Ala Thr Ala Ala Ala Leu Thr Gln Ile Lys Glu
1 5 10 15

Gln Leu Asp Ile Pro Val Val Gly Val Ile Leu Pro Gly Thr Arg Ala
20 25 30

Ala Val Lys Asn Thr Lys Ser Gln Arg Ile Gly Ile Ile Gly Thr Gln
35 40 45

Gly Thr Ile Gln Ser Gly Ser Tyr Glu Gln Ala Ile Leu Ser Lys Val
50 55 60

Pro Thr Ala Gln Pro Val Ser Leu Ala Cys Pro Arg Phe Val Pro Ile
65 70 75 80

Val Glu Ser Asn Gln Ala Asn Ser Ser Val Ala Lys Lys Ile Val Ala
85 90 95

Gln Thr Leu Gln Pro Met Thr Lys Lys Asn Ile Asp Thr Leu Ile Leu
100 105 110

<210> 59
<211> 339
<212> DNA
<213> E. raffinosus

<400> 59

atcgcatgta ataccgcgac ggcagtagct ttggaagaaa ttaaaagaac cgtagatatt
60

cccgtaatcg gtgttataca gccaggatct cgcgcagcgt taaaggcaag cgaaaatggg
120

cgcgtgggaa ttatcggaac cattggaaca gtaaaaagtg gttcttataa acacgaacta
180

caggaaaaag ctcctgatac ttatgtttct agtttagcat gcccaaaatt tgtaccgatt
240

gttgaaagta atcaatttaa tagctcggta gcgaaaaaaa ttgtttctca aacattaact
300

cctttgaaaa aggaaaagtt ggatacgttg attttaggt
339

<210> 60
<211> 112
<212> PRT
<213> E. raffinosus

<400> 60

Ile Ala Cys Asn Thr Ala Thr Ala Val Ala Leu Glu Glu Ile Lys Arg
1 5 10 15

Thr Val Asp Ile Pro Val Ile Gly Val Ile Gln Pro Gly Ser Arg Ala
20 25 30

Ala Leu Lys Ala Ser Glu Asn Gly Arg Val Gly Ile Ile Gly Thr Ile
35 40 45

Gly Thr Val Lys Ser Gly Ser Tyr Lys His Glu Leu Gln Glu Lys Ala
50 55 60

Pro Asp Thr Tyr Val Ser Ser Leu Ala Cys Pro Lys Phe Val Pro Ile
65 70 75 80

Val Glu Ser Asn Gln Phe Asn Ser Ser Val Ala Lys Lys Ile Val Ser
85 90 95

Gln Thr Leu Thr Pro Leu Lys Lys Glu Lys Leu Asp Thr Leu Ile Leu
100 105 110

<210> 61
<211> 341
<212> DNA
<213> E. malodoratus

<400> 61

atcgcatgta ataccgcaac cgcagtggct ttagaagaga ttaagaagaa cgttgatatt
60

cctgttattg gtgttatcca accaggatca cgtgctgcat taaaagcaag taaaaatagt
120

cgtgtaggta tcatcggaac actaggaact gttaaaagtg gatcttataa acatgagctg
180

caagaaaaag caccagaaac gtatgttgct agtctggcct gcccaaaatt tgtgccaatc
240

gttgaaagta atcagtttaa tagttctgta gccaaaaaga ttgtttcaca atctctggca
300

cccttaaaaa aggaaaaatt agatacgttg attttaggtt g
341

<210> 62
<211> 112
<212> PRT
<213> E. malodoratus

<400> 62

Ile Ala Cys Asn Thr Ala Thr Ala Val Ala Leu Glu Glu Ile Lys Lys
1 5 10 15

Asn Val Asp Ile Pro Val Ile Gly Val Ile Gln Pro Gly Ser Arg Ala
20 25 30

Ala Leu Lys Ala Ser Lys Asn Ser Arg Val Gly Ile Ile Gly Thr Leu
35 40 45

Gly Thr Val Lys Ser Gly Ser Tyr Lys His Glu Leu Gln Glu Lys Ala
50 55 60

Pro Glu Thr Tyr Val Ala Ser Leu Ala Cys Pro Lys Phe Val Pro Ile
65 70 75 80

Val Glu Ser Asn Gln Phe Asn Ser Ser Val Ala Lys Lys Ile Val Ser
85 90 95

Gln Ser Leu Ala Pro Leu Lys Lys Glu Lys Leu Asp Thr Leu Ile Leu
100 105 110

<210> 63
<211> 338
<212> DNA
<213> E.

<400> 63



gcatgtaata ccgcaacagc tgtggcttta gatgagatta aagagcaact gcaaatccct 
60 

gttgtgggag ttattatgcc gggaaccaga gcagctgtta aagcgactaa aaatcatcgt 
120 

attggtgtga ttggcacaaa aggaacagtt aaaagtgcct cttacaaacg agcaatcaaa 
180 

gaaaaaaatg aaaatacaaa agtaacaagt ttggcttgtc cgaagtttgt tcccattgtg 
240 

gaaagtaatc aaattcattc ttcagtggca aaaaaaattg tatttgaaac actattaccc 
300 

ttaaaaaata aacatttaga tacgttgatt ttaggttg 
338 

<210> 64 

<211> 111 

<212> PRT 

<213> E. solitarus 

<400> 64 

Ala Cys Asn Thr Ala Thr Ala Val Ala Leu Asp Glu Ile Lys Glu Gln
1 5 10 15

Leu Gln Ile Pro Val Val Gly Val Ile Met Pro Gly Thr Arg Ala Ala
20 25 30

Val Lys Ala Thr Lys Asn His Arg Ile Gly Val Ile Gly Thr Lys Gly
35 40 45

Thr Val Lys Ser Ala Ser Tyr Lys Arg Ala Ile Lys Glu Lys Asn Glu
50 55 60

Asn Thr Lys Val Thr Ser Leu Ala Cys Pro Lys Phe Val Pro Ile Val
65 70 75 80

Glu Ser Asn Gln Ile His Ser Ser Val Ala Lys Lys Ile Val Phe Glu
85 90 95

Thr Leu Leu Pro Leu Lys Asn Lys His Leu Asp Thr Leu Ile Leu
100 105 110

<210> 65 
<211> 341 
<212> DNA 
<213> E. hirae 

<400> 65 

atcgcatgta ataccgctac tgcggttgct ttagaagaaa tcaaggcggc acttcctatt 
60 

ccagtcattg gtgtgatctt acctgggaca agagcagctg ttaaacaaac aagaaataaa 
120 

caagtaggga ttatcggaac cctcggaacg atcaaaagtc gtgcttatga aacagcgctg 
180 

aaaacgaagg tacctgaact tgccgtgact agtttggctt gtccaaaatt cgtttcggta 
240 

gtggaaagta atgaatatca ttcgtcagtg gcaaaaaaaa tcgttgccca gacactagcg 
300 

ccattggtta ctaagaaaat cgatacgttg attttaggtt g 
341

<210> 66
<211> 111
<212> PRT
<213> E. hirae

<400> 66

Ala Cys Asn Thr Ala Thr Ala Val Ala Leu Glu Glu Ile Lys Ala Ala
1 5 10 15

Leu Pro Ile Pro Val Ile Gly Val Ile Leu Pro Gly Thr Arg Ala Ala
20 25 30

Val Lys Gln Thr Arg Asn Lys Gln Val Gly Ile Ile Gly Thr Leu Gly
35 40 45

Thr Ile Lys Ser Arg Ala Tyr Glu Thr Ala Leu Lys Thr Lys Val Pro
50 55 60

Glu Leu Ala Val Thr Ser Leu Ala Cys Pro Lys Phe Val Ser Val Val
65 70 75 80

Glu Ser Asn Glu Tyr His Ser Ser Val Ala Lys Lys Ile Val Ala Gln
85 90 95

Thr Leu Ala Pro Leu Val Thr Lys Lys Ile Asp Thr Leu Ile Leu
100 105 110

ASZD-PO 1-007

<210> 67
<211> 29
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 67

aaatagtcat atgaaaatag gcgtttttg 
29 

<210> 68 

<211> 28 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> primer 

<400> 68 

agaattctat tacaatttga gccattct 
28 

<210> 69 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> primer 

<400> 69 

gcgaattcga tcagaatttt ttttct 
26 

<210> 70 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> primer 

<400> 70 

ataagtactt gtgaatctta tactag 
26

<210> 71
<211> 29
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 71

aaaatgctag taatcgcatg taataccgc
29

<210> 72
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 72 

tgggtacaac ctaaaatcaa cgtatc 
26 

<210> 73 
<211> 765 
<212> DNA 

<213> Aquifex pyrophilus NA sequence 
<400> 73 

atgaagatag gtatctttga cagtggtgtg gggggactta ctgttctaaa ggctataaga
60

aatagataca gaaaggttga tatagtatac ctcggtgata ccgcaagggt tccctacggc
120

ataaggtcta aagatacgat aatcagatac tcccttgagt gtgcgggctt tttaaaggat
180

aagggtgttg atataatcgt cgttgcctgc aataccgcaa gtgcttacgc tcttgaacgt
240

ttaaagaaag agataaacgt tcccgttttc ggcgttattg aacccggggt taaagaagcc
300

ttaaaaaagt caaggaataa aaaaatagga gttataggaa ctcctgcaac cgtaaaaagc
360

ggagcctacc agagaaagct tgaagagggg ggagctgatg tttttgcaaa ggcctgtccc
420

ctattcgttc cccttgcgga ggaaggtctc cttgaggggg agataacaag aaaggttgta
480

gaacactacc ttaaggagtt taaaggtaag attgatactc tgattttagg atgtacccat
540

tacccccttc ttaaaaagga gataaagaag tttttgggag acgttgaagt cgttgactct
600

tccgaagccc tttccctttc cctccataac tttataaagg acgatgggtc ctcatccctt
660

gagttatttt ttacggacct ttccccaaat ctccagtttt tgattaaatt aatactcggt
720

agggattacc cggtaaaact tgcggagggg gtttttacac attaa
765

<210> 74
<211> 262
<212> PRT
<213> Aquifex pyrophilus amino acid sequence

<400> 74



Met Lys Ile Gly Ile Phe Asp Ser Gly Val Gly Gly Leu Thr Val Leu
1 5 10 15

Lys Ala Ile Arg Asn Arg Tyr Arg Lys Val Asp Ile Val Tyr Leu Gly
20 25 30

Asp Thr Ala Arg Val Pro Tyr Gly Ile Arg Ser Lys Asp Phe Thr Thr
35 40 45

Ile Ile Arg Tyr Ser Leu Glu Cys Ala Gly Phe Leu Lys Asp Lys Gly
50 55 60

Val Asp Ile Ile Val Val Ala Cys Asn Thr Ala Ser Ala Tyr Ala Leu
65 70 75 80

Glu Arg Leu Lys Lys Glu Ile Asn Val Pro Val Phe Gly Val Ile Glu
85 90 95

Pro Gly Val Lys Glu Ala Leu Lys Lys Ser Phe Thr Arg Asn Lys Lys
100 105 110

Ile Gly Val Ile Gly Thr Pro Ala Thr Val Lys Ser Gly Ala Tyr Gln
115 120 125

Arg Lys Leu Glu Glu Gly Gly Ala Asp Val Phe Ala Lys Ala Cys Pro
130 135 140

Leu Phe Val Pro Leu Ala Glu Glu Gly Leu Leu Glu Gly Glu Ile Thr
145 150 155 160

Arg Lys Val Val Glu His Tyr Phe Thr Leu Lys Glu Phe Lys Gly Lys
165 170 175

Ile Asp Thr Leu Ile Leu Gly Cys Thr His Tyr Pro Leu Leu Lys Lys
180 185 190

Glu Ile Lys Lys Phe Leu Gly Asp Val Glu Val Val Asp Ser Ser Glu
195 200 205

Ala Leu Ser Leu Ser Leu His Asn Phe Ile Lys Asp Asp Gly Ser Ser
210 215 220

Ser Leu Glu Leu Phe Thr Phe Phe Thr Asp Leu Ser Pro Asn Leu Gln
225 230 235 240

Phe Leu Ile Lys Leu Ile Leu Gly Arg Asp Tyr Pro Val Lys Leu Ala
245 250 255

Glu Gly Val Phe Thr His
260

<210> 75 
<211> 19 
<212> DNA 
<213> primer 

<400> 75 

tgatgcaaca aatggacga 
19 

<210> 76 

<211> 18 

<212> DNA 

<213> primer 

<400> 76 

ttacaatttg agccattc 
18 



